Overview
| Comment: | Additional documentation updates. |
|---|---|
| SHA1: | adc0b3bfb0daaa7677179c5c48b062d8 |
| User & Date: | drh 2008-07-15 14:33:48.000 |
Context
|
2008-07-15
| ||
| 15:34 | Update the SQLite implementation to the 3.6.0 prerelease. check-in: d19a05f2a2 user: drh tags: trunk | |
| 14:33 | Additional documentation updates. check-in: adc0b3bfb0 user: drh tags: trunk | |
| 13:46 | Documentation updates. check-in: 8d8a41d195 user: drh tags: trunk | |
Changes
Changes to www/delta_encoder_algorithm.wiki.
| ︙ | ︙ | |||
13 14 15 16 17 18 19 | <a href="index.wiki">fossil</a> itself, or on tools compatible with it. The exact format of the generated byte-sequences, while in general not necessary to understand encoder operation, can be found in the companion specification titled "<a href="delta_format.wiki">Fossil Delta Format</a>". </p> | | | 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | <a href="index.wiki">fossil</a> itself, or on tools compatible with it. The exact format of the generated byte-sequences, while in general not necessary to understand encoder operation, can be found in the companion specification titled "<a href="delta_format.wiki">Fossil Delta Format</a>". </p> <p>The algorithm is inspired by <a href="http://samba.anu.edu.au/rsync/">rsync</a>.</p> <a name="argresparam"></a><h2>1.0 Arguments, Results, and Parameters</h2> <p>The encoder takes two byte-sequences as input, the "original", and the "target", and returns a single byte-sequence containing the "delta" which transforms the original into the target upon its |
| ︙ | ︙ |
Changes to www/fileformat.wiki.
1 2 3 4 5 6 | <h1 align="center"> Fossil File Formats </h1> <p> The global state of a fossil repository is determined by an unordered | > > > > > | > | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | <h1 align="center"> Fossil File Formats </h1> <p>The state of a fossil repository is kept simple so that it can endure in useful form for decades or centuries. A fossil repository is intended to be readable, searchable, and extensible by people not yet born.</p> <p> The global state of a fossil repository is determined by an unordered set of <i>artifacts</i>. An artifact might be a source code file, the text of a wiki page, part of a trouble ticket, or one of several special control artifacts used to show the relationships between other artifacts within the project. Each artifact is normally represented on disk as a separate file. Artifacts can be text or binary. </p> <p> Each artifact in the repository is named by its SHA1 hash. No prefixes or meta information is added to a artifact before its hash is computed. The name of a artifact in the repository is exactly the same SHA1 hash that is computed by sha1sum |
| ︙ | ︙ | |||
165 166 167 168 169 170 171 | </p> <h2>2.0 Clusters</h2> <p> A cluster is a artifact that declares the existance of other artifacts. Clusters are used during repository synchronization to help | | > > | 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 | </p> <h2>2.0 Clusters</h2> <p> A cluster is a artifact that declares the existance of other artifacts. Clusters are used during repository synchronization to help reduce network traffic. As such, clusters are an optimization and may be removed from a repository without loss or damage to the underlying project code. </p> <p> Clusters follow a syntax that is very similar to manifests. A Cluster is a line-oriented text file. Newline characters (ASCII 0x0a) separate the artifact into cards. Each card begins with a single character "card type". Zero or more arguments may follow |
| ︙ | ︙ |
Changes to www/index.wiki.
1 2 3 4 5 6 7 8 | <h1>Fossil: Distributed Revision Control, Wiki, and Bug-Tracking</h1> <p> Fossil is a new <a href="http://en.wikipedia.org/wiki/Revision_control"> distributed software revision control system</a> that includes an integrated <a href="http://en.wikipedia.org/wiki/Wiki">Wiki</a> and an integrated <a href="http://en.wikipedia.org/wiki/Bugtracker"> | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | <h1>Fossil: Distributed Revision Control, Wiki, and Bug-Tracking</h1> <p> Fossil is a new <a href="http://en.wikipedia.org/wiki/Revision_control"> distributed software revision control system</a> that includes an integrated <a href="http://en.wikipedia.org/wiki/Wiki">Wiki</a> and an integrated <a href="http://en.wikipedia.org/wiki/Bugtracker"> bug-tracking system</a> all in a single, easy-to-use, stand-alone executable. (NB: The bug-tracker component is not yet completely functional, but we expect it to be available soon.) Fossil is <a href="http://www.fossil-scm.org/fossil/timeline">self-hosting</a> since 2007-07-21 on <a href="http://www.hwaci.com/cgi-bin/fossil/timeline">two separate servers</a>. |
| ︙ | ︙ | |||
33 34 35 36 37 38 39 | <a href="http://subversion.tigris.org/">subversion</a>), or operations on local repositories, or all three at the same time</li> <li>Integrated bug tracking and wiki, along the lines of <a href="http://www.cvstrac.org/">CVSTrac</a> and <a href="http://www.edgewall.com/trac/">Trac</a>.</li> <li>Built-in web interface that supports deep archaeological digs through | | | | | | > | | | | | < | | | 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 | <a href="http://subversion.tigris.org/">subversion</a>), or operations on local repositories, or all three at the same time</li> <li>Integrated bug tracking and wiki, along the lines of <a href="http://www.cvstrac.org/">CVSTrac</a> and <a href="http://www.edgewall.com/trac/">Trac</a>.</li> <li>Built-in web interface that supports deep archaeological digs through the project history.</li> <li>All network communication via <a href="http://en.wikipedia.org/wiki/HTTP">HTTP</a> (so that everything works from behind restrictive firewalls).</li> <li>Everything (client, server, and utilities) is included in a single self-contained executable - trivial to install</li> <li>Server runs as <a href="http://www.w3.org/CGI/">CGI</a>, using <a href="http://en.wikipedia.org/wiki/inetd">inetd</a>/<a href="http://www.xinetd.org/">xinetd</a>, or using its own built-in, standalone web server.</li> <li>An entire project contained in single disk file (which also happens to be an <a href="http://www.sqlite.org/">SQLite</a> database.)</li> <li>Trivial to setup and administer</li> <li>Files and versions are identified by their <a href="http://en.wikipedia.org/wiki/SHA-1">SHA1</a> signature.</a> Any unique prefix is sufficient to identify a file or version - usually the first 4 or 5 characters suffice.</li> <li>The <a href="fileformat.wiki">file format</a> designed to be enduring. 
It is deliberately kept simple, requiring nothing more complex than a text editor and an SHA1 checksum generator to encode or decode.</li> <li>Automatic <a href="selfcheck.wiki">self-check</a> on repository changes makes it exceedingly unlikely that data will ever be lost because of a software bug.</li> </ul> <p>Objectives Of Fossil:</p> <ul> <li>Fossil should be ridiculously easy to <a href="build.wiki">install</a> and <a href="quickstart.wiki">operate</a>.</li> <li>With fossil, it should be possible (and <a href="quickstart.wiki#serversetup">easy</a>) to set up a project on an inexpensive shared-hosting ISP (example: <a href="http://www.he.net/hosting.html">Hurricane Electric</a>) that provides nothing more than web space and CGI capability. Here is <a href="http://www.hwaci.com/cgi-bin/fossil/timeline">a demo</a>.</li> <li>Fossil should provide in-depth historical and status information about the project through a web interface</li> <li>Fossil should provide an historical record of a project that endures for decades or centuries and across multiple generations of hardware and software.</li> <li>Fossil should be easily adaptable to different workflows. Fossil implements mechanism, not policy.</li> </ul> <p>User Links:</p> <ul> <li>The <a href="concepts.wiki">concepts</a> behind fossil</li> <li><a href="build.wiki">Building And Installing</a></li> |
| ︙ | ︙ |
Changes to www/sync.wiki.
1 2 3 4 5 6 7 8 9 | <h1 align="center">The Fossil Sync Protocol</h1> <p>Fossil supports commands <b>push</b>, <b>pull</b>, and <b>sync</b> for transferring information from one repository to another. The command is run on the client repository. A URL for the server repository is specified as part of the command. This document describes what happens behind the scenes in order to synchronize the information on the two repositories.</p> | > > > > > > > > > > > > > > > > > | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 | <h1 align="center">The Fossil Sync Protocol</h1> <p>Fossil supports commands <b>push</b>, <b>pull</b>, and <b>sync</b> for transferring information from one repository to another. The command is run on the client repository. A URL for the server repository is specified as part of the command. This document describes what happens behind the scenes in order to synchronize the information on the two repositories.</p> <h2>1.0 Overview</h2> <p>The global state of a fossil repository consists of an unordered collection of artifacts. Each artifact is identified by its SHA1 hash. Synchronization is simply the process of sharing artifacts between servers so that all servers have copies of all artifacts. Because artifacts are unordered, the order in which artifacts are received at a server is inconsequential. It is assumed that the SHA1 hashes of artifacts are unique - that every artifact has a different SHA1 hash. To first approximation, synchronization proceeds by sharing lists SHA1 hashes of available artifacts, then sharing those artifacts that are not found on one side or the other of the connection. In practice, a repository might contain millions of artifacts. The list of SHA1 hashes for this many artifacts can be large. 
So optimizations are employed that usually reduce the number of SHA1 hashes that need to be shared to a few hundred.</p> <h2>2.0 Transport</h2> <p>All communication between client and server is via HTTP requests. The server is listening for incoming HTTP requests. The client issues one or more HTTP requests and receives replies for each request.</p> <p>The server might be running as an independent server using the <b>server</b> command, or it might be launched from inetd or xinetd using the <b>http</b> command. Or the server might be launched from CGI. The details of how the server is configured to "listen" for incoming HTTP requests is immaterial. The important point is that the server is listening for requests and the client is the issuer of the requests.</p> <p>A single push, pull, or sync might involve multiple HTTP requests. The client maintains state between all requests. But on the server side, each request is independent. The server does not preserve any information about the client from one request to the next.</p> <h3>2.1 Server Identification</h3> <p>The server is identified by a URL argument that accompanies the push, pull, or sync command on the client. (As a convenience to users, the URL can be omitted on the client command and the same URL from the most recent push, pull, or sync will be reused. This saves typing in the common case where the client does multiple syncs to the same server.)</p> |
| ︙ | ︙ | |||
47 48 49 50 51 52 53 | <p>Then the URL that is really used to do the synchronization will be:</p> <blockquote> http://fossil-scm.hwaci.com/fossil/xfer </blockquote> | | | 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 | <p>Then the URL that is really used to do the synchronization will be:</p> <blockquote> http://fossil-scm.hwaci.com/fossil/xfer </blockquote> <h3>2.2 HTTP Request Format</h3> <p>The client always sends a POST request to the server. The general format of the POST request is as follows:</p> <blockquote><pre> POST /fossil/xfer HTTP/1.0 Host: fossil-scm.hwaci.com:80 |
| ︙ | ︙ | |||
85 86 87 88 89 90 91 | <i>content...</i> </pre></blockquote> <p>The content type of the reply is always the same as the content type of the request.</p> | | | | | 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 |
<i>content...</i>
</pre></blockquote>
<p>The content type of the reply is always the same as the content type
of the request.</p>
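<p>As a rough illustration of the transport described above, the sketch
below assembles a request by hand. It assumes the "application/x-fossil"
content type and zlib compression of the body, both of which are stated
in the summary of this document. The Content-Length header and the helper
name <b>build_request</b> are illustrative assumptions and are not taken
from the fossil sources. Login cards are omitted for brevity.</p>

```python
import zlib


def build_request(path: str, host: str, cards: str) -> bytes:
    """Assemble a raw HTTP/1.0 POST carrying zlib-compressed x-fossil cards.

    Hypothetical helper for illustration only.
    """
    body = zlib.compress(cards.encode("utf-8"))
    headers = [
        f"POST {path} HTTP/1.0",
        f"Host: {host}",
        "Content-Type: application/x-fossil",
        f"Content-Length: {len(body)}",  # assumed header (ordinary HTTP practice)
    ]
    # A blank line separates the headers from the compressed body.
    return "\r\n".join(headers).encode("ascii") + b"\r\n\r\n" + body


# Example: a request whose only card is a bare "clone" card.
request = build_request("/fossil/xfer", "fossil-scm.hwaci.com:80", "clone\n")
```

<p>Because the reply uses the same content type as the request, a client
written along these lines would decompress the reply body the same way
before reading its cards.</p>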
<h2>3.0 Fossil Synchronization Content</h2>
<p>A synchronization request between a client and server consists of
one or more HTTP requests as described in the previous section. This
section details the "x-fossil" content type.</p>
<h3>3.1 Line-oriented Format</h3>
<p>The x-fossil content type consists of zero or more "cards". Cards
are separated by the newline character ("\n"). Leading and trailing
whitespace on a card is ignored. Blank cards are ignored.</p>
<p>Each card is divided into zero or more space-separated tokens.
The first token on each card is the operator. Subsequent tokens
are arguments. The set of operators understood by servers is slightly
different from the operators understood by clients, though the two
are very similar.</p>
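<p>The line-oriented rules above can be sketched in a few lines of
Python. The function name <b>parse_cards</b> is hypothetical, and the
sketch only handles cards whose arguments all sit on one line; it makes
no attempt to read any content that may follow a file card.</p>

```python
def parse_cards(content: str) -> list[list[str]]:
    """Split x-fossil text into cards, each a list of tokens.

    tokens[0] is the operator; the remaining tokens are its arguments.
    """
    cards = []
    for line in content.split("\n"):
        # Leading and trailing whitespace on a card is ignored.
        tokens = [tok for tok in line.strip().split(" ") if tok]
        # Blank cards are ignored.
        if tokens:
            cards.append(tokens)
    return cards


cards = parse_cards("  igot abc123  \n\n gimme def456\nclone\n")
```

<p>Here <b>cards</b> holds three entries, with operators igot, gimme, and
clone. Dispatching on the first token is then a matter of a lookup table
keyed by operator name.</p>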
<h3>3.2 Login Cards</h3>
<p>Every message from client to server begins with one or more login
cards. Each login card has the following format:</p>
<blockquote>
<b>login</b> <i>userid nonce signature</i>
</blockquote>
|
| ︙ | ︙ | |||
129 130 131 132 133 134 135 | checks out, then the client is granted all privileges of the specified user.</p> <p>Privileges are cumulative. There can be multiple successful login cards. The session privileges are the bit-wise OR of the privileges of each individual login.</p> | | | 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 | checks out, then the client is granted all privileges of the specified user.</p> <p>Privileges are cumulative. There can be multiple successful login cards. The session privileges are the bit-wise OR of the privileges of each individual login.</p> <h3>3.3 File Cards</h3> <p>Repository content records or files are transferred using a "file" card. File cards come in two different formats depending on whether the file is sent directly or as a delta from some other file.</p> <blockquote> |
| ︙ | ︙ | |||
163 164 165 166 167 168 169 | the UUID of another file that is the source of the delta.</p> <p>File cards are sent in both directions: client to server and server to client. A delta might be sent before the source of the delta, so both client and server should remember deltas and be able to apply them when their source arrives.</p> | | | 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 | the UUID of another file that is the source of the delta.</p> <p>File cards are sent in both directions: client to server and server to client. A delta might be sent before the source of the delta, so both client and server should remember deltas and be able to apply them when their source arrives.</p> <h3>3.4 Push and Pull Cards</h3> <p>Among of the first cards in a client-to-server message are the push and pull cards. The push card tell the server that the client is pushing content. The pull card tell the server that the client wants to pull content. In the event of a sync, both cards are sent. The format is as follows:</p> |
| ︙ | ︙ | |||
188 189 190 191 192 193 194 | The projectcode for the client and server must match in order for the transaction to proceed.</p> <p>The server will also send a push card back to the client during a clone. This is how the client determines what project code to put in the new repository it is constructing.</p> | | | | | | | | | 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 | The projectcode for the client and server must match in order for the transaction to proceed.</p> <p>The server will also send a push card back to the client during a clone. This is how the client determines what project code to put in the new repository it is constructing.</p> <h3>3.5 Clone Cards</h3> <p>A clone card works like a pull card in that it is sent from client to server in order to tell the server that the client wants to pull content. But unlike the pull card, the clone card has no arguments.</p> <blockquote> <b>clone</b> </blockquote> <p>In response to a clone message, the server also sends the client a push message so that the client can discover the projectcode for this project.</p> <h3>3.6 Igot Cards</h3> <p>An igot card can be sent from either client to server or from server to client in order to indicate that the sender holds a copy of a particular file. The format is:</p> <blockquote> <b>igot</b> <i>uuid</i> </blockquote> <p>The argument of the igot card is the UUID of the file that the sender possesses. 
The receiver of an igot card will typically check to see if it also holds the same file and if not it will request the file using a gimme card in either the reply or in the next message.</p> <h3>3.7 Gimme Cards</h3> <p>A gimme card is sent from either client to server or from server to client. The gimme card asks the receiver to send a particular file back to the sender. The format of a gimme card is this:</p> <blockquote> <b>gimme</b> <i>uuid</i> </blockquote> <p>The argument to the gimme card is the UUID of the file that the sender wants. The receiver will typically respond to a gimme card by sending a file card in its reply or in the next message.</p> <h3>3.8 Cookie Cards</h3> <p>A cookie card can be used by a server to record a small amount of state information on a client. The server sends a cookie to the client. The client sends the same cookie back to the server on its next request. The cookie card has a single argument which is its payload.</p> <blockquote> <b>cookie</b> <i>payload</i> </blockquote> <p>The client is not required to return the cookie to the server on its next request. Or the client might send a cookie from a different server on the next request. So the server must not depend on the cookie and the server must structure the cookie payload in such a way that it can tell if the cookie it sees is its own cookie or a cookie from another server. (Typically the server will embed its servercode as part of the cookie.)</p> <h3>3.9 Error Cards</h3> <p>If the server discovers anything wrong with a request, it generates an error card in its reply. When the client sees the error card, it displays an error message to the user and aborts the sync operation. An error card looks like this:</p> <blockquote> <b>error</b> <i>error-message</i> </blockquote> <p>The error message is English text that is encoded in order to be a single token. A space (ASCII 0x20) is represented as "\s" (ASCII 0x5C, 0x73). A newline (ASCII 0x0a) is represented as "\n" (ASCII 0x5C, 0x6E). 
A backslash (ASCII 0x5C) is represented as two backslashes "\\". Apart from space and newline, no other whitespace characters nor any unprintable characters are allowed in the error message.</p> <h3>3.10 Unknown Cards</h3> <p>If either the client or the server sees a card that is not described above, then it generates an error and aborts.</p> <h2>4.0 Phantoms And Clusters</h2> <p>When a repository knows that a file exists and knows the UUID of that file, but it does not know the file content, then it stores that file as a "phantom". A repository will typically create a phantom when it receives an igot card for a file that it does not hold or when it receives a file card that references a delta source that it does not hold. When a server is generating its reply or when a client is |
| ︙ | ︙ | |||
314 315 316 317 318 319 320 | <p>Any file that does not match the specifications of a cluster exactly is not a cluster. There must be no extra whitespace in the file. There must be one or more M cards. There must be a single Z card with a correct MD5 checksum. And all cards must be in strict lexicographical order.</p> | | | | | 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 | <p>Any file that does not match the specifications of a cluster exactly is not a cluster. There must be no extra whitespace in the file. There must be one or more M cards. There must be a single Z card with a correct MD5 checksum. And all cards must be in strict lexicographical order.</p> <h3>4.1 The Unclustered Table</h3> <p>Every repository maintains a table named "<b>unclustered</b>" which records the identity of every file and phantom it holds that is not mentioned in a cluster. The entries in the unclustered table can be thought of as leaves on a tree of files. Some of the unclustered files will be clusters. Those clusters may contain other clusters, which might contain still more clusters, and so forth. Beginning with the files in the unclustered table, one can follow the chain of clusters to find every file in the repository.</p> <h2>5.0 Synchronization Strategies</h2> <h3>5.1 Pull</h3> <p>A typical pull operation proceeds as shown below. Details of the actual implementation may very slightly but the gist of a pull is captured in the following steps:</p> <ol> <li>The client sends login and pull cards. |
| ︙ | ︙ | |||
379 380 381 382 383 384 385 | amount of overlap between clusters in the common configuration where there is a single server and many clients. The same synchronization protocol will continue to work even if there are multiple servers or if servers and clients sometimes change roles. The only negative effects of these unusual arrangements is that more than the minimum number of clusters might be generated.</p> | | | 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 | amount of overlap between clusters in the common configuration where there is a single server and many clients. The same synchronization protocol will continue to work even if there are multiple servers or if servers and clients sometimes change roles. The only negative effects of these unusual arrangements is that more than the minimum number of clusters might be generated.</p> <h3>5.2 Push</h3> <p>A typical push operation proceeds roughly as shown below. As with a pull, the actual implementation may vary slightly.</p> <ol> <li>The client sends login and push cards. <li>The client sends file cards for any files that it holds that have |
| ︙ | ︙ | |||
413 414 415 416 417 418 419 | <p>As with a pull, the steps of a push operation repeat until the server knows all files that exist on the client. Also, as with pull, the client attempts to keep the size of the request from growing too large by suppressing file cards once the size of the request reaches 1MB.</p> | | | | 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 |
<p>As with a pull, the steps of a push operation repeat until the
server knows all files that exist on the client. Also, as with
pull, the client attempts to keep the size of the request from
growing too large by suppressing file cards once the
size of the request reaches 1MB.</p>
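<p>The size cap can be pictured with the following sketch. It is not the
actual implementation: the function name <b>select_file_cards</b>, the
reading of "1MB" as one million bytes, and the choice to count only file
content (ignoring per-card overhead) are all assumptions made for
illustration.</p>

```python
REQUEST_LIMIT = 1_000_000  # assumption: "1MB" taken as one million bytes


def select_file_cards(pending, header_size=0, limit=REQUEST_LIMIT):
    """Split pending (uuid, content) pairs into those sent now and those deferred.

    File cards keep being added until the running request size reaches the
    limit; every file after that point is held back for a later request.
    """
    sent, deferred = [], []
    size = header_size
    for uuid, content in pending:
        if size < limit:
            sent.append(uuid)
            size += len(content)
        else:
            deferred.append(uuid)
    return sent, deferred


# Small limit so the effect is visible: the third file is deferred.
pending = [("a", b"x" * 6), ("b", b"x" * 6), ("c", b"x" * 6)]
sent, deferred = select_file_cards(pending, limit=10)
```

<p>Files left in the deferred list would simply go out on a later round,
which is why the push steps repeat until the server knows every file.</p>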
<h3>5.3 Sync</h3>
<p>A sync is just a pull and a push that happen at the same time.
The first three steps of a pull are combined with the first five steps
of a push. Steps (4) through (7) of a pull are combined with steps
(5) through (8) of a push. And steps (8) through (10) of a pull
are combined with step (9) of a push.</p>
<h2>6.0 Summary</h2>
<p>Here are the key points of the synchronization protocol:</p>
<ol>
<li>The client sends one or more HTTP POST requests to the server.
The request and reply content type is "application/x-fossil".
<li>HTTP request content is compressed using zlib.
|
| ︙ | ︙ |