Fossil

Check-in [04eef95223]
Login

Check-in [04eef95223]

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Update the performance stats webpage.
Downloads: Tarball | ZIP archive
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 04eef9522386a59e31063b9300a5b9670c2be146
User & Date: drh 2015-02-28 14:46:19.775
Context
2015-02-28
14:55
Further updates to the performance stats page: "yrs" becomes "years". Give a specific example of the commit bandwidth. ... (check-in: 2762fecd9e user: drh tags: trunk)
14:46
Update the performance stats webpage. ... (check-in: 04eef95223 user: drh tags: trunk)
14:15
Automatically run extra delta-compression and vacuum a repository after a clone. And change the page size to 8192 if there are more than 1000 pages. ... (check-in: 35c25558cb user: drh tags: trunk)
Changes
Unified Diff Ignore Whitespace Patch
Changes to www/stats.wiki.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
<title>Fossil Performance</title>
<h1 align="center">Performance Statistics</h1>

The questions will inevitably arise:  How does Fossil perform? 
Does it use a lot of disk space or bandwidth?  Is it scalable?

In an attempt to answers these questions, this report looks at several
projects that use fossil for configuration management and examines how
well they are working.  The following table is a summary of the results.
(Last updated on 2012-02-26.)
Explanation and analysis follows the table.

<table border=1>
<tr>
<th>Project</th>
<th>Number Of Artifacts</th>
<th>Number Of Check-ins</th>
<th>Project&nbsp;Duration<br>(as of 2009-08-23)</th>
<th>Average Check-ins Per Day</th>
<th>Uncompressed Size</th>
<th>Repository Size</th>
<th>Compression Ratio</th>
<th>Clone Bandwidth</th>
</tr>

<tr align="center">
<td>[http://www.sqlite.org/src/timeline | SQLite]
<td>41113
<td>9943
<td>4290&nbsp;days<br>11.75&nbsp;yrs
<td>2.32
<td>2.09&nbsp;GB
<td>33.2&nbsp;MB
<td>63:1
<td>23.2&nbsp;MB
</tr>

<tr align="center">
<td>[http://core.tcl.tk/tcl/timeline | TCL]
<td>74806
<td>13541
<td>5085&nbsp;days<br>13.92&nbsp;yrs
<td>2.66
<td>5.2&nbsp;GB
<td>86&nbsp;MB
<td>60:1
<td>67.0&nbsp;MB
</tr>

<tr align="center">
<td>[/timeline | Fossil]
<td>15561
<td>3764
<td>1681&nbsp;days<br>4.6&nbsp;yrs
<td>2.24
<td>721&nbsp;MB
<td>18.8&nbsp;MB
<td>38:1
<td>12.0&nbsp;MB
</tr>

<tr align="center">
<td>[http://www.sqlite.org/slt/timeline | SLT]
<td>2174
<td>100
<td>1183&nbsp;days<br>3.24&nbsp;yrs
<td>0.08
<td>1.94&nbsp;GB
<td>143&nbsp;MB
<td>12:1
<td>141&nbsp;MB
</tr>

<tr align="center">
<td>[http://www.sqlite.org/th3.html | TH3]
<td>5624
<td>1472
<td>1248&nbsp;days<br>3.42&nbsp;yrs
<td>1.78
<td>252&nbsp;MB
<td>12.5&nbsp;MB
<td>20:1
<td>12.2&nbsp;MB
</tr>

<tr align="center">
<td>[http://www.sqlite.org/docsrc/timeline | SQLite Docs]
<td>3664
<td>1003
<td>1567&nbsp;days<br>4.29&nbsp;yrs
<td>0.64
<td>108&nbsp;MB
<td>6.6&nbsp;MB
<td>16:1
<td>5.71&nbsp;MB
</tr>

</table>

<h2>Measured Attributes</h2>

In Fossil, every version of every file, every wiki page, every change to









|







|
<








|
|
|
<
|
|
|
|




|
|
|
<
|
|
|
|




|
|
|
|
|
<
|
|




|
|
|
<
|
|
|
|




|
|
|
<
|
|
|
|




|
|
|
<
|
|
|
|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

19
20
21
22
23
24
25
26
27
28
29

30
31
32
33
34
35
36
37
38
39
40

41
42
43
44
45
46
47
48
49
50
51
52
53

54
55
56
57
58
59
60
61
62

63
64
65
66
67
68
69
70
71
72
73

74
75
76
77
78
79
80
81
82
83
84

85
86
87
88
89
90
91
92
93
94
95
<title>Fossil Performance</title>
<h1 align="center">Performance Statistics</h1>

The questions will inevitably arise:  How does Fossil perform? 
Does it use a lot of disk space or bandwidth?  Is it scalable?

In an attempt to answers these questions, this report looks at several
projects that use fossil for configuration management and examines how
well they are working.  The following table is a summary of the results.
(Last updated on 2015-02-28.)
Explanation and analysis follows the table.

<table border=1>
<tr>
<th>Project</th>
<th>Number Of Artifacts</th>
<th>Number Of Check-ins</th>
<th>Project&nbsp;Duration<br>(as of 2015-02-28)</th>

<th>Uncompressed Size</th>
<th>Repository Size</th>
<th>Compression Ratio</th>
<th>Clone Bandwidth</th>
</tr>

<tr align="center">
<td>[http://www.sqlite.org/src/timeline | SQLite]
<td>56089
<td>14357
<td>5388&nbsp;days<br>14.75&nbsp;yrs

<td>3.4&nbsp;GB
<td>45.5&nbsp;MB
<td>73:1
<td>29.9&nbsp;MB
</tr>

<tr align="center">
<td>[http://core.tcl.tk/tcl/timeline | TCL]
<td>139662
<td>18125
<td>6183&nbsp;days<br>16.93&nbsp;yrs

<td>6.6&nbsp;GB
<td>192.6&nbsp;MB
<td>34:1
<td>117.1&nbsp;MB
</tr>

<tr align="center">
<td>[/timeline | Fossil]
<td>29937
<td>8238
<td>2779&nbsp;days<br>7.61&nbsp;yrs
<td>2.3&nbsp;GB
<td>34.0&nbsp;MB

<td>67:1
<td>21.5&nbsp;MB
</tr>

<tr align="center">
<td>[http://www.sqlite.org/slt/timeline | SLT]
<td>2278
<td>136
<td>2282&nbsp;days<br>6.25&nbsp;yrs

<td>2.0&nbsp;GB
<td>144.7&nbsp;MB
<td>13:1
<td>142.1&nbsp;MB
</tr>

<tr align="center">
<td>[http://www.sqlite.org/th3.html | TH3]
<td>8729
<td>2406
<td>2347&nbsp;days<br>6.43&nbsp;yrs

<td>386&nbsp;MB
<td>15.2&nbsp;MB
<td>25:1
<td>12.6&nbsp;MB
</tr>

<tr align="center">
<td>[http://www.sqlite.org/docsrc/timeline | SQLite Docs]
<td>5503
<td>1631
<td>2666&nbsp;days<br>7.30&nbsp;yrs

<td>194&nbsp;MB
<td>10.9&nbsp;MB
<td>17:1
<td>8.37&nbsp;MB
</tr>

</table>

<h2>Measured Attributes</h2>

In Fossil, every version of every file, every wiki page, every change to
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
changed, it still only counts as one check-in.

The "Uncompressed Size" is the total size of all the artifacts within
the repository assuming they were all uncompressed and stored 
separately on the disk.  Fossil makes use of delta compression between related
versions of the same file, and then uses zlib compression on the resulting
deltas.  The total resulting repository size is shown after the uncompressed
size.  For this chart, "fossil rebuild --compress" was run on each repository
prior to measuring its compressed size.  Repository sizes would typically 
be 20% larger without that rebuild.

On the right end of the table, we show the "Clone Bandwidth".  This is the
total number of bytes sent from server back to the client.  The number of
bytes sent from client to server is negligible in comparison.
These byte counts include HTTP protocol overhead.

In the table and throughout this article,
"GB" means gigabytes (10<sup><small>9</small></sup> bytes)
not <a href="http://en.wikipedia.org/wiki/Gibibyte">gibibytes</a>
(2<sup><small>30</small></sup> bytes).  Similarly, "MB" and "KB"
means megabytes and kilobytes, not mebibytes and kibibytes.

<h2>Analysis And Supplemental Data</h2>

Perhaps the two most interesting datapoints in the above table are SQLite
and SLT.  SQLite is a long-running project with long revision chains.
Some of the files in SQLite have been edited over a thousand times.
Each of these edits is stored as a delta, and hence the SQLite project
gets excellent 63:1 compression.  SLT, on the other hand, consists of
many large (megabyte-sized) SQL scripts that have one or maybe two
edits each.  There is very little delta compression occurring and so the
overall repository compression ratio is much lower.  Note also that
quite a bit more bandwidth is required to clone SLT than SQLite.

For the first nine years of its development, SQLite was versioned by CVS.
The resulting CVS repository measured over 320MB in size.  So, the
developers were surprised to see that this entire project could be cloned in
fossil using only about 23.2MB of network traffic.  (This 23.2MB includes
all the changes to SQLite that have been made since the conversion from
CVS.  Of those changes are omitted, the clone bandwidth drops to 13MB.)
The "sync" protocol
used by fossil has turned out to be surprisingly efficient.  A typical
check-in on SQLite might use 3 or 4KB of network bandwidth total, hardly
worth measuring.  The sync protocol is efficient enough that, once cloned,
Fossil could easily be used over a dial-up connection.







<
|
<


















|







|
<
<
|





108
109
110
111
112
113
114

115

116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142


143
144
145
146
147
148
changed, it still only counts as one check-in.

The "Uncompressed Size" is the total size of all the artifacts within
the repository assuming they were all uncompressed and stored 
separately on the disk.  Fossil makes use of delta compression between related
versions of the same file, and then uses zlib compression on the resulting
deltas.  The total resulting repository size is shown after the uncompressed

size.


On the right end of the table, we show the "Clone Bandwidth".  This is the
total number of bytes sent from server back to the client.  The number of
bytes sent from client to server is negligible in comparison.
These byte counts include HTTP protocol overhead.

In the table and throughout this article,
"GB" means gigabytes (10<sup><small>9</small></sup> bytes)
not <a href="http://en.wikipedia.org/wiki/Gibibyte">gibibytes</a>
(2<sup><small>30</small></sup> bytes).  Similarly, "MB" and "KB"
means megabytes and kilobytes, not mebibytes and kibibytes.

<h2>Analysis And Supplemental Data</h2>

Perhaps the two most interesting datapoints in the above table are SQLite
and SLT.  SQLite is a long-running project with long revision chains.
Some of the files in SQLite have been edited over a thousand times.
Each of these edits is stored as a delta, and hence the SQLite project
gets excellent 73:1 compression.  SLT, on the other hand, consists of
many large (megabyte-sized) SQL scripts that have one or maybe two
edits each.  There is very little delta compression occurring and so the
overall repository compression ratio is much lower.  Note also that
quite a bit more bandwidth is required to clone SLT than SQLite.

For the first nine years of its development, SQLite was versioned by CVS.
The resulting CVS repository measured over 320MB in size.  So, the
developers were surprised to see that the equivalent Fossil project (the


first nine years on SQLite) would clone with only 13MB of bandwidth.
The "sync" protocol
used by fossil has turned out to be surprisingly efficient.  A typical
check-in on SQLite might use 3 or 4KB of network bandwidth total, hardly
worth measuring.  The sync protocol is efficient enough that, once cloned,
Fossil could easily be used over a dial-up connection.