Diff
Not logged in

Differences From Artifact [5cff62ccf6]:

To Artifact [e3ece24fa5]:


1
2
3
4
5
6
7
8
9

















10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
<h1 align="center">The Fossil Sync Protocol</h1>

<p>Fossil supports commands <b>push</b>, <b>pull</b>, and <b>sync</b>
for transferring information from one repository to another.  The
command is run on the client repository.  A URL for the server repository
is specified as part of the command.  This document describes what happens
behind the scenes in order to synchronize the information on the two
repositories.</p>


















<h2>1.0 Transport</h2>

<p>All communication between client and server is via HTTP requests.
The server is listening for incoming HTTP requests.  The client
issues one or more HTTP requests and receives replies for each
request.</p>

<p>The server might be running as an independent server
using the <b>server</b> command, or it might be launched from
inetd or xinetd using the <b>http</b> command.  Or the server might
be launched from CGI.  The details of how the server is configured
to "listen" for incoming HTTP requests is immaterial.  The important
point is that the server is listening for requests and the client
is the issuer of the requests.</p>

<p>A single push, pull, or sync might involve multiple HTTP requests.
The client maintains state between all requests.  But on the server
side, each request is independent.  The server does not preserve
any information about the client from one request to the next.</p>

<h3>1.1 Server Identification</h3>

<p>The server is identified by a URL argument that accompanies the
push, pull, or sync command on the client.  (As a convenience to
users, the URL can be omitted on the client command and the same URL
from the most recent push, pull, or sync will be reused.  This saves
typing in the common case where the client does multiple syncs to
the same server.)</p>









>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
|



















|







1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
<h1 align="center">The Fossil Sync Protocol</h1>

<p>Fossil supports commands <b>push</b>, <b>pull</b>, and <b>sync</b>
for transferring information from one repository to another.  The
command is run on the client repository.  A URL for the server repository
is specified as part of the command.  This document describes what happens
behind the scenes in order to synchronize the information on the two
repositories.</p>

<h2>1.0 Overview</h2>

<p>The global state of a fossil repository consists of an unordered
collection of artifacts.  Each artifact is identified by its SHA1 hash.
Synchronization is simply the process of sharing artifacts between
servers so that all servers have copies of all artifacts.  Because
artifacts are unordered, the order in which artifacts are received
at a server is inconsequential.  It is assumed that the SHA1 hashes
of artifacts are unique - that every artifact has a different SHA1 hash.
To first approximation, synchronization proceeds by sharing lists 
SHA1 hashes of available artifacts, then sharing those artifacts that
are not found on one side or the other of the connection.  In practice,
a repository might contain millions of artifacts.  The list of
SHA1 hashes for this many artifacts can be large.  So optimizations are
employed that usually reduce the number of SHA1 hashes that need to be
shared to a few hundred.</p>

<h2>2.0 Transport</h2>

<p>All communication between client and server is via HTTP requests.
The server is listening for incoming HTTP requests.  The client
issues one or more HTTP requests and receives replies for each
request.</p>

<p>The server might be running as an independent server
using the <b>server</b> command, or it might be launched from
inetd or xinetd using the <b>http</b> command.  Or the server might
be launched from CGI.  The details of how the server is configured
to "listen" for incoming HTTP requests is immaterial.  The important
point is that the server is listening for requests and the client
is the issuer of the requests.</p>

<p>A single push, pull, or sync might involve multiple HTTP requests.
The client maintains state between all requests.  But on the server
side, each request is independent.  The server does not preserve
any information about the client from one request to the next.</p>

<h3>2.1 Server Identification</h3>

<p>The server is identified by a URL argument that accompanies the
push, pull, or sync command on the client.  (As a convenience to
users, the URL can be omitted on the client command and the same URL
from the most recent push, pull, or sync will be reused.  This saves
typing in the common case where the client does multiple syncs to
the same server.)</p>
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
<p>Then the URL that is really used to do the synchronization will
be:</p>

<blockquote>
http://fossil-scm.hwaci.com/fossil/xfer
</blockquote>

<h3>1.2 HTTP Request Format</h3>

<p>The client always sends a POST request to the server.  The
general format of the POST request is as follows:</p>

<blockquote><pre>
POST /fossil/xfer HTTP/1.0
Host: fossil-scm.hwaci.com:80







|







64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
<p>Then the URL that is really used to do the synchronization will
be:</p>

<blockquote>
http://fossil-scm.hwaci.com/fossil/xfer
</blockquote>

<h3>2.2 HTTP Request Format</h3>

<p>The client always sends a POST request to the server.  The
general format of the POST request is as follows:</p>

<blockquote><pre>
POST /fossil/xfer HTTP/1.0
Host: fossil-scm.hwaci.com:80
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117

<i>content...</i>
</pre></blockquote>

<p>The content type of the reply is always the same as the content type
of the request.</p>

<h2>2.0 Fossil Synchronization Content</h2>

<p>A synchronization request between a client and server consists of
one or more HTTP requests as described in the previous section.  This
section details the "x-fossil" content type.</p>

<h3>2.1 Line-oriented Format</h3>

<p>The x-fossil content type consists of zero or more "cards".  Cards
are separate by the newline character ("\n").  Leading and trailing
whitespace on a card is ignored.  Blank cards are ignored.</p>

<p>Each card is divided into zero or more space separated tokens.
The first token on each card is the operator.  Subsequent tokens
are arguments.  The set of operators understood by servers is slightly
different from the operators understood by clients, though the two
are very similar.</p>

<h3>2.2 Login Cards</h3>

<p>Every message from client to server begins with one or more login
cards.  Each login card has the following format:</p>

<blockquote>
<b>login</b>  <i>userid  nonce  signature</i>
</blockquote>







|





|











|







102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134

<i>content...</i>
</pre></blockquote>

<p>The content type of the reply is always the same as the content type
of the request.</p>

<h2>3.0 Fossil Synchronization Content</h2>

<p>A synchronization request between a client and server consists of
one or more HTTP requests as described in the previous section.  This
section details the "x-fossil" content type.</p>

<h3>3.1 Line-oriented Format</h3>

<p>The x-fossil content type consists of zero or more "cards".  Cards
are separate by the newline character ("\n").  Leading and trailing
whitespace on a card is ignored.  Blank cards are ignored.</p>

<p>Each card is divided into zero or more space separated tokens.
The first token on each card is the operator.  Subsequent tokens
are arguments.  The set of operators understood by servers is slightly
different from the operators understood by clients, though the two
are very similar.</p>

<h3>3.2 Login Cards</h3>

<p>Every message from client to server begins with one or more login
cards.  Each login card has the following format:</p>

<blockquote>
<b>login</b>  <i>userid  nonce  signature</i>
</blockquote>
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
checks out, then the client is granted all privileges of the
specified user.</p>

<p>Privileges are cumulative.  There can be multiple successful
login cards.  The session privileges are the bit-wise OR of the
privileges of each individual login.</p>

<h3>2.3 File Cards</h3>

<p>Repository content records or files are transferred using
a "file" card.  File cards come in two different formats depending
on whether the file is sent directly or as a delta from some
other file.</p>

<blockquote>







|







146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
checks out, then the client is granted all privileges of the
specified user.</p>

<p>Privileges are cumulative.  There can be multiple successful
login cards.  The session privileges are the bit-wise OR of the
privileges of each individual login.</p>

<h3>3.3 File Cards</h3>

<p>Repository content records or files are transferred using
a "file" card.  File cards come in two different formats depending
on whether the file is sent directly or as a delta from some
other file.</p>

<blockquote>
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
the UUID of another file that is the source of the delta.</p>

<p>File cards are sent in both directions: client to server and
server to client.  A delta might be sent before the source of
the delta, so both client and server should remember deltas
and be able to apply them when their source arrives.</p>

<h3>2.4 Push and Pull Cards</h3>

<p>Among of the first cards in a client-to-server message are
the push and pull cards.  The push card tell the server that
the client is pushing content.  The pull card tell the server
that the client wants to pull content.  In the event of a sync,
both cards are sent.  The format is as follows:</p>








|







180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
the UUID of another file that is the source of the delta.</p>

<p>File cards are sent in both directions: client to server and
server to client.  A delta might be sent before the source of
the delta, so both client and server should remember deltas
and be able to apply them when their source arrives.</p>

<h3>3.4 Push and Pull Cards</h3>

<p>Among of the first cards in a client-to-server message are
the push and pull cards.  The push card tell the server that
the client is pushing content.  The pull card tell the server
that the client wants to pull content.  In the event of a sync,
both cards are sent.  The format is as follows:</p>

188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
The projectcode for the client and server must match in order
for the transaction to proceed.</p>

<p>The server will also send a push card back to the client
during a clone.  This is how the client determines what project
code to put in the new repository it is constructing.</p>

<h3>2.5 Clone Cards</h3>

<p>A clone card works like a pull card in that it is sent from
client to server in order to tell the server that the client
wants to pull content.  But unlike the pull card, the clone
card has no arguments.</p>

<blockquote>
<b>clone</b>
</blockquote>

<p>In response to a clone message, the server also sends the client
a push message so that the client can discover the projectcode for
this project.</p>

<h3>2.6 Igot Cards</h3>

<p>An igot card can be sent from either client to server or from
server to client in order to indicate that the sender holds a copy
of a particular file.  The format is:</p>

<blockquote>
<b>igot</b> <i>uuid</i>
</blockquote>

<p>The argument of the igot card is the UUID of the file that
the sender possesses.
The receiver of an igot card will typically check to see if
it also holds the same file and if not it will request the file
using a gimme card in either the reply or in the next message.</p>

<h3>2.7 Gimme Cards</h3>

<p>A gimme card is sent from either client to server or from server
to client.  The gimme card asks the receiver to send a particular
file back to the sender.  The format of a gimme card is this:</p>

<blockquote>
<b>gimme</b> <i>uuid</i>
</blockquote>

<p>The argument to the gimme card is the UUID of the file that
the sender wants.  The receiver will typically respond to a
gimme card by sending a file card in its reply or in the next
message.</p>

<h3>2.8 Cookie Cards</h3>

<p>A cookie card can be used by a server to record a small amount
of state information on a client.  The server sends a cookie to the
client.  The client sends the same cookie back to the server on
its next request.  The cookie card has a single argument which
is its payload.</p>

<blockquote>
<b>cookie</b> <i>payload</i>
</blockquote>

<p>The client is not required to return the cookie to the server on
its next request.  Or the client might send a cookie from a different
server on the next request.  So the server must not depend on the
cookie and the server must structure the cookie payload in such
a way that it can tell if the cookie it sees is its own cookie or
a cookie from another server.  (Typically the server will embed
its servercode as part of the cookie.)</p>

<h3>2.9 Error Cards</h3>

<p>If the server discovers anything wrong with a request, it generates
an error card in its reply.  When the client sees the error card,
it displays an error message to the user and aborts the sync
operation.  An error card looks like this:</p>

<blockquote>
<b>error</b> <i>error-message</i>
</blockquote>

<p>The error message is English text that is encoded in order to
be a single token.
A space (ASCII 0x20) is represented as "\s" (ASCII 0x5C, 0x73).  A
newline (ASCII 0x0a) is "\n" (ASCII 0x6C, x6E).  A backslash 
(ASCII 0x5C) is represented as two backslashes "\\".  Apart from
space and newline, no other whitespace characters nor any
unprintable characters are allowed in
the error message.</p>

<h3>2.10 Unknown Cards</h3>

<p>If either the client or the server sees a card that is not
described above, then it generates an error and aborts.</p>

<h2>3.0 Phantoms And Clusters</h2>

<p>When a repository knows that a file exists and knows the UUID of
that file, but it does not know the file content, then it stores that
file as a "phantom".  A repository will typically create a phantom when
it receives an igot card for a file that it does not hold or when it
receives a file card that references a delta source that it does not
hold.  When a server is generating its reply or when a client is







|














|















|














|



















|



















|




|







205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
The projectcode for the client and server must match in order
for the transaction to proceed.</p>

<p>The server will also send a push card back to the client
during a clone.  This is how the client determines what project
code to put in the new repository it is constructing.</p>

<h3>3.5 Clone Cards</h3>

<p>A clone card works like a pull card in that it is sent from
client to server in order to tell the server that the client
wants to pull content.  But unlike the pull card, the clone
card has no arguments.</p>

<blockquote>
<b>clone</b>
</blockquote>

<p>In response to a clone message, the server also sends the client
a push message so that the client can discover the projectcode for
this project.</p>

<h3>3.6 Igot Cards</h3>

<p>An igot card can be sent from either client to server or from
server to client in order to indicate that the sender holds a copy
of a particular file.  The format is:</p>

<blockquote>
<b>igot</b> <i>uuid</i>
</blockquote>

<p>The argument of the igot card is the UUID of the file that
the sender possesses.
The receiver of an igot card will typically check to see if
it also holds the same file and if not it will request the file
using a gimme card in either the reply or in the next message.</p>

<h3>3.7 Gimme Cards</h3>

<p>A gimme card is sent from either client to server or from server
to client.  The gimme card asks the receiver to send a particular
file back to the sender.  The format of a gimme card is this:</p>

<blockquote>
<b>gimme</b> <i>uuid</i>
</blockquote>

<p>The argument to the gimme card is the UUID of the file that
the sender wants.  The receiver will typically respond to a
gimme card by sending a file card in its reply or in the next
message.</p>

<h3>3.8 Cookie Cards</h3>

<p>A cookie card can be used by a server to record a small amount
of state information on a client.  The server sends a cookie to the
client.  The client sends the same cookie back to the server on
its next request.  The cookie card has a single argument which
is its payload.</p>

<blockquote>
<b>cookie</b> <i>payload</i>
</blockquote>

<p>The client is not required to return the cookie to the server on
its next request.  Or the client might send a cookie from a different
server on the next request.  So the server must not depend on the
cookie and the server must structure the cookie payload in such
a way that it can tell if the cookie it sees is its own cookie or
a cookie from another server.  (Typically the server will embed
its servercode as part of the cookie.)</p>

<h3>3.9 Error Cards</h3>

<p>If the server discovers anything wrong with a request, it generates
an error card in its reply.  When the client sees the error card,
it displays an error message to the user and aborts the sync
operation.  An error card looks like this:</p>

<blockquote>
<b>error</b> <i>error-message</i>
</blockquote>

<p>The error message is English text that is encoded in order to
be a single token.
A space (ASCII 0x20) is represented as "\s" (ASCII 0x5C, 0x73).  A
newline (ASCII 0x0a) is "\n" (ASCII 0x6C, x6E).  A backslash 
(ASCII 0x5C) is represented as two backslashes "\\".  Apart from
space and newline, no other whitespace characters nor any
unprintable characters are allowed in
the error message.</p>

<h3>3.10 Unknown Cards</h3>

<p>If either the client or the server sees a card that is not
described above, then it generates an error and aborts.</p>

<h2>4.0 Phantoms And Clusters</h2>

<p>When a repository knows that a file exists and knows the UUID of
that file, but it does not know the file content, then it stores that
file as a "phantom".  A repository will typically create a phantom when
it receives an igot card for a file that it does not hold or when it
receives a file card that references a delta source that it does not
hold.  When a server is generating its reply or when a client is
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341

<p>Any file that does not match the specifications of a cluster
exactly is not a cluster.  There must be no extra whitespace in
the file.  There must be one or more M cards.  There must be a
single Z card with a correct MD5 checksum.  And all cards must
be in strict lexicographical order.</p>

<h3>3.1 The Unclustered Table</h3>

<p>Every repository maintains a table named "<b>unclustered</b>"
which records the identity of every file and phantom it holds that is not
mentioned in a cluster.  The entries in the unclustered table can
be thought of as leaves on a tree of files.  Some of the unclustered
files will be clusters.  Those clusters may contain other clusters,
which might contain still more clusters, and so forth.  Beginning
with the files in the unclustered table, one can follow the chain
of clusters to find every file in the repository.</p>

<h2>4.0 Synchronization Strategies</h2>

<h3>4.1 Pull</h3>

<p>A typical pull operation proceeds as shown below.  Details
of the actual implementation may very slightly but the gist of
a pull is captured in the following steps:</p>

<ol>
<li>The client sends login and pull cards.







|










|

|







331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358

<p>Any file that does not match the specifications of a cluster
exactly is not a cluster.  There must be no extra whitespace in
the file.  There must be one or more M cards.  There must be a
single Z card with a correct MD5 checksum.  And all cards must
be in strict lexicographical order.</p>

<h3>4.1 The Unclustered Table</h3>

<p>Every repository maintains a table named "<b>unclustered</b>"
which records the identity of every file and phantom it holds that is not
mentioned in a cluster.  The entries in the unclustered table can
be thought of as leaves on a tree of files.  Some of the unclustered
files will be clusters.  Those clusters may contain other clusters,
which might contain still more clusters, and so forth.  Beginning
with the files in the unclustered table, one can follow the chain
of clusters to find every file in the repository.</p>

<h2>5.0 Synchronization Strategies</h2>

<h3>5.1 Pull</h3>

<p>A typical pull operation proceeds as shown below.  Details
of the actual implementation may very slightly but the gist of
a pull is captured in the following steps:</p>

<ol>
<li>The client sends login and pull cards.
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
amount of overlap between clusters in the common configuration where
there is a single server and many clients.  The same synchronization
protocol will continue to work even if there are multiple servers
or if servers and clients sometimes change roles.  The only negative
effects of these unusual arrangements is that more than the minimum
number of clusters might be generated.</p>

<h3>4.2 Push</h3>

<p>A typical push operation proceeds roughly as shown below.  As
with a pull, the actual implementation may vary slightly.</p>

<ol>
<li>The client sends login and push cards.
<li>The client sends file cards for any files that it holds that have







|







396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
amount of overlap between clusters in the common configuration where
there is a single server and many clients.  The same synchronization
protocol will continue to work even if there are multiple servers
or if servers and clients sometimes change roles.  The only negative
effects of these unusual arrangements is that more than the minimum
number of clusters might be generated.</p>

<h3>5.2 Push</h3>

<p>A typical push operation proceeds roughly as shown below.  As
with a pull, the actual implementation may vary slightly.</p>

<ol>
<li>The client sends login and push cards.
<li>The client sends file cards for any files that it holds that have
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435

<p>As with a pull, the steps of a push operation repeat until the
server knows all files that exist on the client.  Also, as with
pull, the client attempts to keep the size of the request from
growing too large by suppressing file cards once the
size of the request reaches 1MB.</p>

<h3>4.3 Sync</h3>

<p>A sync is just a pull and a push that happen at the same time.
The first three steps of a pull are combined with the first five steps
of a push.  Steps (4) through (7) of a pull are combined with steps
(5) through (8) of a push.  And steps (8) through (10) of a pull
are combined with step (9) of a push.</p>

<h2>5.0 Summary</h2>

<p>Here are the key points of the synchronization protocol:</p>

<ol>
<li>The client sends one or more PUSH HTTP requests to the server.
    The request and reply content type is "application/x-fossil".
<li>HTTP request content is compressed using zlib.







|







|







430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452

<p>As with a pull, the steps of a push operation repeat until the
server knows all files that exist on the client.  Also, as with
pull, the client attempts to keep the size of the request from
growing too large by suppressing file cards once the
size of the request reaches 1MB.</p>

<h3>5.3 Sync</h3>

<p>A sync is just a pull and a push that happen at the same time.
The first three steps of a pull are combined with the first five steps
of a push.  Steps (4) through (7) of a pull are combined with steps
(5) through (8) of a push.  And steps (8) through (10) of a pull
are combined with step (9) of a push.</p>

<h2>6.0 Summary</h2>

<p>Here are the key points of the synchronization protocol:</p>

<ol>
<li>The client sends one or more PUSH HTTP requests to the server.
    The request and reply content type is "application/x-fossil".
<li>HTTP request content is compressed using zlib.