Artifact 8e375d2bcfeadd9a2ec8ec20c41b8745ce4920f7b336acf67ec2ac104b97866a:

File www/fossil-v-git.wiki — part of check-in [d1c61803fc] at 2019-07-14 12:25:07 on branch bsd-vs-gpl — Small tweak to prev (user: wyoung size: 24186)
<title>Fossil Versus Git</title>

<h2>1.0 Don't Stress!</h2>

If you start out using one DVCS and later decide you like the other better,
you can easily [./inout.wiki | move your content].¹

Fossil and [http://git-scm.com | Git] are very similar in many respects,
but they also have important differences.
See the table below for
a high-level summary and the text that follows for more details.

Keep in mind that you are reading this on a Fossil website, and though
we try to be fair, the information here
might be biased in favor of Fossil.  Ask around for second opinions from
people who have used <em>both</em> Fossil and Git.

&#185;<small><i>Git does not include a
wiki, a ticket tracker, a forum, or a tech-note feature, so those elements will not transfer when
exporting from Fossil to Git. GitHub adds some of these to stock Git,
but because they're not part of Git proper, [./mirrortogithub.md|exporting a Fossil
repository to GitHub] will still not include them; Fossil tickets do not
become GitHub issues, for example.</i></small>

<h2>2.0 Differences Between Fossil And Git</h2>

Differences between Fossil and Git are summarized by the following table,
with further description in the text that follows.

<blockquote><table border=1 cellpadding=5 align=center>
<tr><th width="50%">GIT</th><th width="50%">FOSSIL</th></tr>
<tr><td>File versioning only</td>
    <td>Versioning, Tickets, Wiki, Technotes, Forum</td></tr>
<tr><td>Ad-hoc pile-of-files key/value database</td>
    <td>Relational SQL database</td></tr>
<tr><td>Bazaar-style development</td><td>Cathedral-style development</td></tr>
<tr><td>Designed for Linux kernel development</td>
    <td>Designed for SQLite development</td></tr>
<tr><td>Many contributors</td>
    <td>Select contributors</td></tr>
<tr><td>Focus on individual branches</td>
    <td>Focus on the entire tree of changes</td></tr>
<tr><td>Lots of little tools</td><td>Stand-alone executable</td></tr>
<tr><td>One check-out per repository</td>
    <td>Many check-outs per repository</td></tr>
<tr><td>Remembers what you should have done</td>
    <td>Remembers what you actually did</td></tr>
</table></blockquote>

<h3 id="features">2.1 Feature Set</h3>

Git provides file versioning services only, whereas Fossil adds
integrated [./wikitheory.wiki | wiki],
[./bugtheory.wiki | ticketing &amp; bug tracking],
[./embeddeddoc.wiki | embedded documentation], 
[./event.wiki | technical notes], and a [./forum.wiki | forum].
These additional capabilities are available for Git as 3rd-party
add-ons, but with Fossil they are integrated into
the design.  One way to describe Fossil is that it is
"[https://github.com/ | GitHub]-in-a-box."

If you clone [https://github.com/git/git|Git's self-hosting repository],
you get just Git's source code.
If you clone Fossil's self-hosting repository, you get the entire
Fossil website — source code, documentation, ticket history, and so forth.
That means you get a copy of this very article and all of its historical
versions, plus the same for all of the other public content on this site.

For developers who choose to self-host projects (rather than using a
3rd-party service such as GitHub), Fossil is much easier to set up, since
the stand-alone Fossil executable together with a [./server.wiki#cgi|2-line CGI script]
suffices to instantiate a full-featured developer website.  To accomplish
the same using Git requires locating, installing, configuring, integrating,
and managing a wide assortment of separate tools.  Standing up a developer
website using Fossil can be done in minutes, whereas doing the same using
Git requires hours or days.

<h3 id="database">2.2 Database</h3>

The baseline data structures for Fossil and Git are the same, modulo
formatting details.  Both systems store check-ins as immutable
objects referencing their immediate ancestors and named by a
cryptographic hash of the check-in content.
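
A minimal Python sketch of that shared idea follows. It assumes SHA-256 and an invented text layout purely for illustration; the real artifact formats and hash algorithms of both systems differ in detail, and the <tt>check_in</tt> helper is hypothetical.

```python
import hashlib

def check_in(store, content, parents=()):
    """Add an immutable check-in to `store` and return its name."""
    # The parent names are part of the hashed text, so the name of a
    # check-in depends on its content and on its immediate ancestors.
    header = "".join(f"parent {p}\n" for p in parents)
    name = hashlib.sha256((header + "\n" + content).encode("utf-8")).hexdigest()
    store[name] = (tuple(parents), content)
    return name

store = {}
root = check_in(store, "initial import")
fix = check_in(store, "fix typo", parents=[root])
```

Because the name is derived from the content, storing the same check-in twice yields the same name, and no stored object can be altered without its name changing.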

The difference is that Git stores its objects as individual files
in the ".git" folder or compressed into
bespoke "pack-files," whereas Fossil stores its objects in a
relational ([https://www.sqlite.org/|SQLite]) database file.  To put it
another way, Git uses an ad-hoc pile-of-files key/value database whereas
Fossil uses a proven, general-purpose SQL database.  This
difference is more than an implementation detail.  It
has important consequences.

With Git, one can easily locate the ancestors of a particular check-in
by following the pointers embedded in the check-in object, but it is
difficult to go the other direction and locate the descendants of a
check-in.  It is so difficult, in fact, that neither native Git nor
GitHub provides this capability.  With Git, if you are looking at some
historical check-in then you cannot ask
"What came next?" or "What are the children of this check-in?"

Fossil, on the other hand, parses essential information about check-ins
(parents, children, committers, comments, files changed, etc.)
into a relational database that can be easily
queried using concise SQL statements to find both ancestors and
descendants of a check-in.
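
As a rough illustration of the contrast drawn above, the following Python sketch models both approaches. The <tt>objects</tt> dictionary, the <tt>link</tt> table, and the check-in names are all hypothetical; neither Git's object format nor Fossil's actual schema is reproduced here.

```python
import sqlite3

# Hypothetical history: check-in name -> names of its parents.
objects = {"a1": [], "b2": ["a1"], "c3": ["b2"], "d4": ["b2"], "e5": ["c3"]}

def children_by_scan(name):
    """Key/value style: only parent pointers exist, so finding the
    children of a check-in means examining every stored object."""
    return sorted(n for n, parents in objects.items() if name in parents)

# Relational style: load the same edges into a (hypothetical) link table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE link(parent TEXT, child TEXT)")
db.executemany(
    "INSERT INTO link VALUES (?, ?)",
    [(p, child) for child, parents in objects.items() for p in parents],
)

# One recursive query per direction; only the matched column changes.
DESCENDANTS = """
WITH RECURSIVE d(name) AS (
    SELECT child FROM link WHERE parent = ?
    UNION
    SELECT link.child FROM link JOIN d ON link.parent = d.name
)
SELECT name FROM d ORDER BY name
"""

ANCESTORS = """
WITH RECURSIVE a(name) AS (
    SELECT parent FROM link WHERE child = ?
    UNION
    SELECT link.parent FROM link JOIN a ON link.child = a.name
)
SELECT name FROM a ORDER BY name
"""

def descendants(name):
    return [row[0] for row in db.execute(DESCENDANTS, (name,))]

def ancestors(name):
    return [row[0] for row in db.execute(ANCESTORS, (name,))]
```

In this model, asking "what came next?" is the same kind of query as asking "what came before?"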

Leaf check-ins in Git that lack a "ref" become "detached," making them
difficult to locate and subject to garbage collection.  This
"detached head" problem has caused untold grief for countless
Git users.  With Fossil, all check-ins are easily located using
a variety of attributes (parents, children, committer, date, full-text
search of the check-in comment) and so detached heads are simply not possible.
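
The reachability rule behind this problem can be sketched in a few lines of Python. This is a simplified model with hypothetical check-in names, not Git's actual garbage collector: a check-in survives only if some ref leads to it through parent pointers.

```python
# Hypothetical history: check-in name -> names of its parents.
parents = {"a1": [], "b2": ["a1"], "c3": ["b2"], "x9": ["b2"]}
refs = {"main": "c3"}  # "x9" is a leaf that no ref points to

def reachable(parents, refs):
    """Return every check-in that can be reached from some ref."""
    seen = set()
    todo = list(refs.values())
    while todo:
        name = todo.pop()
        if name not in seen:
            seen.add(name)
            todo.extend(parents[name])
    return seen

def unreferenced(parents, refs):
    """Check-ins that no ref leads to: candidates for collection."""
    return sorted(set(parents) - reachable(parents, refs))
```

Here <tt>unreferenced(parents, refs)</tt> reports only <tt>x9</tt>, and giving that leaf a ref of its own removes it from the list. Real Git also consults other roots, such as the reflog, before pruning anything; the sketch ignores them.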

The ease with which check-ins can be located and queried in Fossil
has resulted in a huge variety of reports and status screens
([./webpage-ex.md|examples]) that show project state
in ways that help developers
maintain enhanced awareness and comprehension
and avoid errors.


<h3 id="vs-linux">2.3 Linux vs. SQLite</h3>

Fossil and Git promote different development styles because each one was
specifically designed to support the primary authors' main software
development project: [https://en.wikipedia.org/wiki/Linus_Torvalds|Linus
Torvalds] designed Git to support development of
[https://www.kernel.org/|the Linux kernel], and
[https://en.wikipedia.org/wiki/D._Richard_Hipp|D. Richard Hipp] designed
Fossil to support the development of [https://sqlite.org/|SQLite].
SQLite is much more widely deployed than the Linux kernel, but for
Linux-based systems, the kernel is the more fundamental component.
Both projects must rank high on any objective list of "most
important FOSS projects," yet these two projects are almost entirely unlike
one another.

In the following sections, we will explain how three key differences
between Linux and SQLite dictated the design of each DVCS's low-friction
usage path.

When deciding between these two DVCSes, you should ask yourself, "Is my
project more like Linux or more like SQLite?"


<h4 id="devorg">2.3.1 Development Organization</h4>

Eric S. Raymond's seminal essay-turned-book
"[https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar|The
Cathedral and the Bazaar]" details the two major development
organization styles found in
[https://en.wikipedia.org/wiki/Free_and_open-source_software|FOSS]
projects. As it happens, Linux and SQLite fall on opposite sides of this
dichotomy. Differing development organization styles dictate a different
design and low-friction usage path in the tools created to support each
project.

Git promotes the Linux kernel's bazaar development style, in which a
loosely-associated mass of developers contribute their work through
[https://git-scm.com/book/en/v2/Distributed-Git-Distributed-Workflows#_dictator_and_lieutenants_workflow|a
hierarchy of lieutenants] who manage and clean up these contributions
for consideration by Linus Torvalds, who has the power to cherrypick
individual contributions into his version of the Linux kernel. Git
allows an anonymous developer to rebase and push specific locally-named
private branches, so that a Git repo clone often isn't really a clone at
all: it may have an arbitrary number of differences relative to the
repository it originally cloned from. Git encourages siloed development.
Select work in a developer's local repository may remain private
indefinitely.

All of this is exactly what one wants when doing bazaar-style
development.

Fossil's normal mode of operation differs on every one of these points,
with the specific designed-in goal of promoting SQLite's cathedral
development model:

<ul>
    <li><p><b>Personal engagement:</b> SQLite's developers know each
    other by name and work together daily on the project.</p></li>

    <li><p><b>Trust over hierarchy:</b> Fossil supports developers given
    direct commit capability on the repository rather than supporting a
    hierarchical "dictator and lieutenants" contribution style.  D.
    Richard Hipp rarely overrides decisions made by those he has trusted
    with commit access on his repositories.
    [/doc/trunk/www/admin-v-setup.md|Some users] have more power over
    what they can do with the repository, but Fossil does not otherwise
    directly support the enforcement of a development organization's
    social hierarchy. Fossil is a great fit for
    [https://en.wikipedia.org/wiki/Flat_organization|flat
    organizations].</p></li>

    <li><p><b>Anonymous contribution discouraged:</b> Anonymous
    contribution is possible in a Fossil project, but there is no
    low-friction path to it, as there is in Git. Fossil's closest equivalent to
    Git pull requests is the [/help?cmd=bundle|bundle], which requires
    higher engagement than firing off a PR. Both Fossil and Git also
    support <tt>patch(1)</tt> files, but that's a lossy contribution
    path in both systems.</p></li>

    <li><p><b>No rebasing:</b> When a remote clone syncs changes up to
    its parent repository, the changes are sent exactly as they were
    committed to the local repository. [#history|There is no rebasing
    mechanism, on purpose.]</p></li>

    <li><p><b>Sync over push:</b> Explicit pushes are uncommon in
    Fossil-based projects; the default is to rely on
    [/help?cmd=autosync|autosync mode] instead, in which each commit
    normally syncs immediately to its parent repository, so that
    explicit pushes are not needed.</p></li>

    <li><p><b>Branch names sync:</b> Unlike in Git, branch names are not
    purely local labels. They sync along with everything else, so
    everyone sees the same set of branch names.</p></li>

    <li><p><b>Private branches are rare:</b>
    [/doc/trunk/www/private.wiki|Private branches exist in Fossil], but
    they're normally used to handle rare exception cases, whereas in
    many Git projects, they're part of the straight-line development
    process.</p></li>

    <li><p><b>Identical clones:</b> Fossil's autosync system tries to
    keep each local clone identical to the repository it was cloned
    from.</p></li>
</ul>

Where Git encourages siloed development, Fossil fights against it.
[https://en.wikipedia.org/wiki/Jim_McCarthy_(author)|Jim McCarthy] put
it well in his book on software project management,
[https://www.amazon.com/dp/0735623198/|Dynamics of Software
Development]: "[https://www.youtube.com/watch?v=oY6BCHqEbyc|Beware of a
guy in a room]." Fossil places a lot of emphasis on synchronizing
everyone's work and on reporting on the state of the project and the
work of its developers, so that everyone — especially the project leader
— can maintain a better mental picture of what is happening, leading to
better situational awareness.

Each DVCS can be used in the opposite style, but doing so works against
its low-friction path.


<h4 id="scale">2.3.2 Scale</h4>

The Linux kernel has a far bigger developer community than that of
SQLite: there are thousands and thousands of contributors to Linux, most
of whom do not know each other's names. These thousands are responsible
for producing roughly 89⨉ more code than is in SQLite. (10.7
[https://en.wikipedia.org/wiki/Source_lines_of_code|MLOC] vs 0.12 MLOC
according to [https://dwheeler.com/sloccount/|SLOCCount].) The Linux
kernel and its development process were already uncommonly large back in
2005 when Git was designed, specifically to support the consequences of
having such a large set of developers working on such a large code base.

95% of the code in SQLite comes from just four programmers, and 64% of
it is from the lead developer alone. The SQLite developers know each
other well and interact daily. Fossil was designed for this development
model. We also think that because Fossil was born a year later than
Git, it was able to learn from some of Git's key design mistakes.

We think you should ask yourself whether you have Linus Torvalds scale
software configuration management problems or D. Richard Hipp scale
problems when choosing your DVCS. An
[https://en.wikipedia.org/wiki/Impact_wrench|automotive air impact
wrench] running at 8000 RPM driving an M8 socket-cap bolt at 16 cm/s is
not the best way to hang a picture on the living room wall.


<h4 id="contrib">2.3.3 Accepting Contributions</h4>

Git is covered by
[https://en.wikipedia.org/wiki/GNU_General_Public_License#Version_2|the
GPLv2]. Fossil is covered by
[https://fossil-scm.org/fossil/file/COPYRIGHT-BSD2.txt|a two-clause BSD
style license]. Neither license affects the managed repository contents,
and it is not our purpose here to try to persuade you to make the same
choice of license that we did, but we believe the choice of license has
an effect on the actual use of each DVCS. If you don't want to read
about licensing, you can skip to [#whylicmat|our point at the end].

The GPL allows a project to do without a
[https://en.wikipedia.org/wiki/Contributor_License_Agreement|contributor
license agreement] (CLA) because by the very act of distributing
binaries produced from GPL'd source code, you are bound by the license
to also distribute that source code under a compatible license. Some
GPL-based projects do require a CLA, but usually only to further
commercial interests rather than to maintain the legal integrity of the
[https://en.wikipedia.org/wiki/Free_and_open-source_software|FOSS]
project itself.

Contrast a BSD-style project, where contributions are not automatically
relicensed merely by being distributed along with the preexisting BSD
code. Such projects often require a CLA even when there are no corporate
interests, to ensure that all contributions are compatibly licensed with
the existing body of code. It's a way to add a "no takebacks" clause to
the basic BSD license.

A CLA makes signing up new contributors harder. It's an extra
gatekeeping step, so it discourages low-engagement contributors. The
stock GPL requires some of the same relinquishment of rights as Fossil's
CLA, and the Git project adds to this
[https://github.com/git/git/blob/master/Documentation/SubmittingPatches#L306|an
implicit CLA], but contributors agree to both passively.
[http://fossil-scm.org/home/doc/trunk/www/contribute.wiki|The Fossil
project's contribution process] requires active steps and processing
time: the printing, signing, mailing, reception, and processing of the
CLA.

<a name="whylicmat"></a>
We think there's an upside to this difference in licensing, in Fossil's favor: it
improves contributor community cohesion, because everyone who pushed
past that legal friction made an affirmative, active step to get commit
capability on the Fossil project repository. We believe discouraging
[https://www.jonobacon.com/2012/07/25/building-strong-community-structural-integrity/|drive-by
contributions] makes for a better, more carefully-designed, simpler
DVCS.

It's so easy to add features to Git that its command interface has
become truly arcane. Masters of the arcane are able to do wizardly
things, but only by studying their art deeply for years. This does not
strike us as a good use of the user's time. We believe it's better to
have a simpler tool with a more easily internalized behavior set, which
you can use quickly then set aside in order to get back to your main
task of producing the content that you manage in the DVCS. We achieve
that by carefully choosing which users to give commit bits to, and which
of their feature branches get merged down to trunk.


<h3 id="branches">2.4 Individual Branches vs. The Entire Change History</h3>

Both Fossil and Git store history as a directed acyclic graph (DAG)
of changes, but Git tends to focus more on individual branches of
the DAG, whereas Fossil puts more emphasis on the entire DAG.

For example, the default "sync" behavior in Git is to only sync
a single branch, whereas with Fossil the only sync option is to
sync the entire DAG.  Git commands,
GitHub, and GitLab tend to show only a single branch at
a time, whereas Fossil usually shows all parallel branches at
once.  Git has commands like "rebase" that help keep all relevant
changes on a single branch, whereas Fossil encourages a style of
many concurrent branches constantly springing into existence,
undergoing active development in parallel for a few days or weeks, then
merging back into the main line and disappearing.
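
The difference in sync behavior described above can be modeled as two set operations. This is a toy Python sketch with hypothetical check-in and branch names; it says nothing about either system's wire protocol.

```python
# Hypothetical local history: check-in name -> names of its parents.
history = {"a1": [], "b2": ["a1"], "c3": ["b2"], "x9": ["b2"]}
tips = {"trunk": "c3", "experiment": "x9"}

def branch_closure(tip):
    """A branch tip plus everything reachable through parent pointers."""
    seen, todo = set(), [tip]
    while todo:
        name = todo.pop()
        if name not in seen:
            seen.add(name)
            todo.extend(history[name])
    return seen

def push_single_branch(remote, branch):
    """Single-branch style: the remote gains one branch's check-ins."""
    return remote | branch_closure(tips[branch])

def sync_everything(remote):
    """Whole-DAG style: the remote gains every local check-in."""
    return remote | set(history)
```

Pushing only <tt>trunk</tt> to an empty remote leaves <tt>x9</tt> behind on the developer's machine, whereas syncing the entire DAG makes the experimental work visible to everyone.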

This difference in emphasis arises from the different purposes of
the two systems.  Git focuses on individual branches, because that
is exactly what you want for a highly-distributed bazaar-style project
such as Linux.  Linus Torvalds does not want to see every check-in
by every contributor to Linux, as such extreme visibility does not scale
well.  But Fossil was written for the cathedral-style SQLite project
with just a handful of active committers.  Seeing all
changes on all branches all at once helps keep the whole team
up-to-date with what everybody else is doing, resulting in a more 
tightly focused and cohesive implementation.


<h3 id="executables">2.5 Lots of little tools vs. Self-contained system</h3>

Git consists of many small tools, each doing one small part of the job,
which can be recombined (by experts) to perform powerful operations.
Git has a lot of complexity and many dependencies and requires an "installer"
script or program to get it running.

Fossil is a single self-contained stand-alone executable with hardly
any dependencies.  Fossil can be (and often is) run inside a
minimally configured chroot jail.  To install Fossil,
one merely puts the executable somewhere in the $PATH.

The designer of Git says that the Unix philosophy is to have lots of
small tools that collaborate to get the job done.  The designer of
Fossil says that the Unix philosophy is "It just works."  Both
individuals have written their DVCSes to reflect their own view
of the "Unix philosophy."


<h3 id="checkouts">2.6 One vs. Many Check-outs per Repository</h3>

A "repository" in Git is a pile-of-files in the ".git" subdirectory
of a single check-out.  The check-out and the repository are located
together in the filesystem.

With Fossil, a "repository" is a single SQLite database file
that can be stored anywhere.  There
can be multiple active check-outs from the same repository, perhaps
open on different branches or on different snapshots of the same branch.
Long-running tests or builds can be running in one check-out while
changes are being committed in another.

Git version 2.5 adds a feature to emulate Fossil's decoupling of the
repository from the check-out tree, which it calls
"[https://git-scm.com/docs/git-worktree|git-worktree]." This command
sets up a series of links in the filesystem to
allow a single repository to host multiple check-outs.  However,
the interface is sufficiently difficult to use that most people
find it easier to create a separate clone for each check-out.
There are also practical consequences of the way it's implemented
that make worktrees not quite equivalent to the main Git repo + checkout
tree.

With Fossil, the complete decoupling of repository and check-out tree
means every working check-out tree is treated equally. It's common in
Fossil to have a check-out tree for each major working branch so that
you can switch branches with a "cd" command rather than replace the
current working file set with a different file set by updating in place,
as Git prefers.


<h3 id="history">2.7 What you should have done vs. What you actually did</h3>

Git puts a lot of emphasis on maintaining
a "clean" check-in history.  Extraneous and experimental branches by
individual developers often never make it into the main repository.  And
branches are often rebased before being pushed, to make
it appear as if development had been linear.  Git strives to record what
the development of a project should have looked like had there been no
mistakes.

Fossil, in contrast, puts more emphasis on recording exactly what happened,
including all of the messy errors, dead-ends, experimental branches, and
so forth.  One might argue that this
makes the history of a Fossil project "messy."  But another point of view
is that this makes the history "accurate."  In actual practice, the
superior reporting tools available in Fossil mean that the added "mess"
is not a factor.
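
To make the distinction concrete, here is a toy Python model of what a rebase does to recorded history. It assumes, for illustration only, that a check-in is named by a SHA-256 hash of its parent's name plus its message; the <tt>commit</tt> and <tt>rebase</tt> helpers are hypothetical and far simpler than the real Git command.

```python
import hashlib

def commit(message, parent):
    """Create a check-in whose name depends on its parent and message."""
    text = f"parent {parent}\n\n{message}"
    name = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return {"name": name, "message": message, "parent": parent}

def rebase(branch, new_base):
    """Replay each check-in's message on top of new_base.

    The originals are not modified; brand-new check-ins with new
    names are created, recording a history that never occurred.
    """
    replayed, parent = [], new_base
    for original in branch:
        new = commit(original["message"], parent)
        replayed.append(new)
        parent = new["name"]
    return replayed

base = commit("base", None)
trunk_work = commit("trunk work", base["name"])
feature_1 = commit("feature 1", base["name"])
feature_2 = commit("feature 2", feature_1["name"])

rebased = rebase([feature_1, feature_2], trunk_work["name"])
```

After the rebase, the replayed check-ins claim to descend from <tt>trunk_work</tt>, and the record that they were actually developed against <tt>base</tt> survives only in the originals, which may never be shared.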

One commentator has mused that Git records history according to
the victors, whereas Fossil records history as it actually happened.


<h2 id="missing">3.0 Missing Features</h2>

Most of the capabilities found in Git are also available in Fossil and
the other way around. For example, both systems have local check-outs,
remote repositories, push/pull/sync, bisect capabilities, and a "stash."
Both systems store project history as a directed acyclic graph (DAG)
of immutable check-in objects.

But there are a few capabilities in one system that are missing from the
other.


<h3 id="missing-in-git">3.1 Features found in Fossil but missing from Git</h3>

  *  <b>The ability to show descendants of a check-in.</b>

   Both Git and Fossil can easily find the ancestors of a check-in.  But
   only Fossil shows the descendants.  (It is possible to find the
   descendants of a check-in in Git using the log, but that is sufficiently
   difficult that nobody ever actually does it.)

  *  <b>Wiki, Embedded documentation, Trouble-tickets, Tech-Notes, and Forum</b>

   Git only provides versioning of source code.  Fossil strives to provide
   other related project management services as well.

  *  <b>Named branches</b>

   Branches in Fossil have persistent names that are propagated
   to collaborators via [/help?cmd=push|push] and [/help?cmd=pull|pull].
   All developers see the same name on the same branch.  Git, in contrast,
   uses only local branch names, so developers working on the
   same project can (and frequently do) use a different name for the
   same branch.

  *  <b>The [/help?cmd=all|fossil all] command</b>

   Fossil keeps track of all repositories and check-outs and allows
   operations over all of them with a single command.  For example, in
   Fossil it is possible to request a pull of all repositories on a laptop
   from their respective servers, prior to taking the laptop off network.
   Or it is possible to do "fossil all changes" to see if there are any
   uncommitted changes that were overlooked prior to the end of the workday.

  *  <b>The [/help?cmd=ui|fossil ui] command</b>

   Fossil supports an integrated web interface.  Some of the same features
   are available using third-party add-ons for Git, but they do not provide
   nearly as many features and they are not nearly as convenient to use.


<h3 id="missing-in-fossil">3.2 Features found in Git but missing from Fossil</h3>

  *  <b>Rebase</b>

   Because of its emphasis on recording history exactly as it happened,
   rather than as we would have liked it to happen, Fossil deliberately
   does not provide a "rebase" command.  One can rebase manually in Fossil,
   with sufficient perseverance, but it is not something that can be done with
   a single command.

  *  <b>Push or pull a single branch</b>

   The [/help?cmd=push|fossil push], [/help?cmd=pull|fossil pull], and
   [/help?cmd=sync|fossil sync] commands do not provide the capability to
   push or pull individual branches.  Pushing and pulling in Fossil is
   all or nothing.  This is in keeping with Fossil's emphasis on maintaining
   a complete record and on sharing everything between all developers.