Fossil-NG

Status

This page is a place to discuss and record ideas for "Fossil: The Next Generation", or Fossil-NG. There is currently no code in place for Fossil-NG.

Broadly, this document collects ideas that have come up since Fossil was created in 2006 and that are difficult enough to build atop the current design that implementing them might break compatibility with prior Fossil versions. Much as the hash policy feature forced the change from Fossil 1.x to Fossil 2.0, implementing some of the following ideas might force a similarly epochal shift to "Fossil 3.0" or "Fossil 4.0".

Goals

In evaluating the following ideas, it is important to keep in mind that the primary and most important purpose of Fossil is to support the development of the SQLite project; it has succeeded fabulously in that respect. Any changes implemented for Fossil-NG must respect that history. In particular, Fossil-NG must be able to import the complete SQLite history intact, without any changes to hash identifiers.

Partial Clones and Syncs

Fossil is currently limited to complete clones and complete syncs. This has several drawbacks in real-world workflows which we can solve by adding support for:

  • Shallow Clones - the ability to clone just the recent history of a project.

    This allows efficient working clones of very large repositories: you almost certainly don't need all 25 years of the past history of a long-running project on your local machine, and if you do, you almost certainly don't need all of it 5 minutes after the initial clone. It should be possible to clone only, say, the past 30 days on all open branches initially, then have other artifacts pulled from the central repository only as necessary.

  • Narrow Clones - the ability to clone a subset of a project, selected by subdirectory.

    There are two primary use cases:

    First, you might have a simple single-project repository with subdirectories like doc/, src/, test/, etc., but you might only need, say, the src/ subdirectory for a given operation. There's no point taking up disk space in the checkout directory with subdirectories holding files you have no intention of ever using.

    Second, Fossil currently works best with the repo-per-project version control strategy. Fossil's speed drops continually as the size of the repository grows, in large part due to lookup times in the SQLite repo DB file being roughly proportional to log(N), where N is the number of artifacts, checkins, files, etc. This is true of most other DVCSes, but there remain good arguments for a monorepo. Narrow clones allow you to have a monorepo but work with only the subset of it that you actually need in a given case.

  • Selective Sync - the ability to push/pull/sync single branches.

    The closest current alternative to this is private branches, but those have a number of workflow drawbacks.
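
The three selection rules above can be sketched as filters over check-in metadata. This is only an illustration of the idea, not a proposed implementation: the `Checkin` record, its field names, and the three function names are all assumptions made for this example, and nothing here reflects Fossil's actual sync protocol or schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Checkin:
    """Hypothetical check-in summary; field names are illustrative only."""
    hash_id: str
    branch: str
    when: datetime
    branch_open: bool
    paths: tuple  # file paths touched by this check-in


def shallow_select(checkins, now, max_age_days=30):
    """Shallow clone: recent check-ins on open branches only."""
    cutoff = now - timedelta(days=max_age_days)
    return [c for c in checkins if c.branch_open and c.when >= cutoff]


def narrow_select(checkins, subdir):
    """Narrow clone: check-ins touching at least one file under subdir."""
    prefix = subdir.rstrip("/") + "/"
    return [c for c in checkins
            if any(p.startswith(prefix) for p in c.paths)]


def branch_select(checkins, branch):
    """Selective sync: check-ins on a single named branch."""
    return [c for c in checkins if c.branch == branch]


# Made-up sample data, purely for demonstration.
NOW = datetime(2024, 1, 31)
SAMPLE = [
    Checkin("a1", "trunk", datetime(2024, 1, 20), True, ("src/main.c",)),
    Checkin("b2", "trunk", datetime(2023, 6, 1), True, ("doc/index.md",)),
    Checkin("c3", "old-feature", datetime(2024, 1, 25), False, ("src/x.c",)),
]

recent = shallow_select(SAMPLE, NOW)
src_only = narrow_select(SAMPLE, "src")
trunk_only = branch_select(SAMPLE, "trunk")
```

A real design would also have to decide how to represent the artifacts that were deliberately left out, so that they can be pulled later only as necessary.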

Better Git Integration

Although Fossil is nearly as old as Git and several other popular DVCSes, Git has emerged as the dominant DVCS in part due to market forces, rather than on purely technical merit. (See Fossil vs. Git.)

This is unfortunate, because of the various version control systems available today, Git is the least user-friendly.

Nevertheless, Fossil must have some level of accommodation with this reality.

Fossil 2.x currently has several features in this line, primarily the longstanding fossil import --git command and the much newer fossil git export command. (The latter replaces the old fossil export --git command, deprecated in Fossil 2.9.)

There are still some missing pieces, however. We'd like to be able to:

  • Round-Trip with Git: A GitHub export of a Fossil repository is effectively a read-only clone: if someone sends you a PR against your GitHub export, and you accept it on GitHub, that work won't then propagate back up into the Fossil project. To maintain a sane version history, all checkins to the project must go through Fossil, not through Git.

    In the same way that fossil export --git was replaced with the more powerful fossil git export command, we'd like a fossil git import command that would sync changes to a Git repo back into a Fossil repo, allowing full round-tripping between the two, within the limitations of the various formats.

    (We word it that way to exclude things outside the Git file format; we aren't going to try to make GitHub issues sync as Fossil tickets and vice versa, for example.)

  • Fossil as Git Front-End: It would be nice to be able to use the superior fossil command line and web UI experiences with preexisting Git repos without doing a fossil import --git conversion. We'd like all of the following to work:

      $ fossil clone https://github.com/me/my-repo.git
      $ cd my-repo
      $ fossil ui &
      ...hack on files in Git repo...
      $ fossil ci       # wrapper for "git commit -a && git push"
    

    We expand on this topic below.

  • Git as Fossil Back-End: If we do both of the above, then we're just a few steps away from using the Git repository format as a Fossil storage option from the start. That is, instead of starting work with Git and then converting to Fossil, you could say something like:

      $ fossil init --git repo
    

    ...to create a Git packfile-formatted repository directory. You'd lose Fossil features like the ticket tracker, the wiki, the web forum, etc., of course.

  • Git as Fossil Front-End: Although the git command is famously much more difficult to use than fossil, there are practical advantages to allowing this, such as integration with existing automated tooling. For instance, it would help Fossil's adoption if you could use a text editor's Git integration features to work with a Fossil repository, rather than require that it be reimplemented as a Fossil-specific integration. Another common case is CI/CD systems, which all know how to pull Git repos but rarely have built-in capabilities to deal with Fossil repos.
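
As a rough illustration of the "Fossil as Git Front-End" idea, the wrapper behavior hinted at in the example session could be modeled as a translation from a Fossil-style subcommand to a list of Git command lines. The function name `git_plan` is hypothetical. Only the `ci` mapping comes from the session above; the `clone` mapping is an assumption added for this sketch. The function only builds the command lists and does not run Git.

```python
def git_plan(subcommand, *args):
    """Return the Git command lines a Fossil-style subcommand might map to.

    Hypothetical sketch: only "ci" is taken from the example session above
    (wrapper for "git commit -a && git push"); "clone" is an assumption.
    """
    if subcommand == "ci":
        return [["git", "commit", "-a", *args], ["git", "push"]]
    if subcommand == "clone":
        return [["git", "clone", *args]]
    raise ValueError(f"no Git mapping sketched for {subcommand!r}")


ci_plan = git_plan("ci")
clone_plan = git_plan("clone", "https://github.com/me/my-repo.git")
```

Each inner list is in the form that `subprocess.run` accepts, so a caller could execute the plan step by step and stop at the first failure.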

VCS as File Container

One way to think about a VCS is as an enhanced file container. You can use file container formats like tar and zip to create a single snapshot of a project. A VCS is the same except that it allows one to store multiple snapshots.

Fossil-NG should have facilities that allow you to work with Fossil repositories in as simple a manner as traditional command-line file archiving tools like tar and zip. For example:

    $ fossil new new-repo.fossil Makefile README.md src/

That command will create a file container named "new-repo.fossil" that contains the files Makefile, README.md, and all files under src/. This repository would be compact, and suitable for sending as an email attachment, just like one would do with a tarball or ZIP archive.

The SQLite Archiver project shows how to build a file container using SQLite for storage. Fossil-NG should include the capabilities of SQLite Archiver, though the underlying file format will doubtless be more advanced so that it can also support block-chain versioning.
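
For readers unfamiliar with SQLite Archiver, the following minimal sketch shows the kind of single-table file container it uses, written with Python's standard `sqlite3` and `zlib` modules. It follows the existing `sqlar` table layout, in which content is stored zlib-compressed only when that makes it smaller and `sz` records the uncompressed size. It is not a proposal for the Fossil-NG format, and the helper names `add_file` and `read_file` are assumptions made for this example.

```python
import sqlite3
import zlib

# Table layout used by SQLite Archiver ("sqlar") files.
SQLAR_SCHEMA = """
CREATE TABLE IF NOT EXISTS sqlar(
  name  TEXT PRIMARY KEY,  -- name of the file
  mode  INT,               -- access permissions
  mtime INT,               -- last modification time
  sz    INT,               -- original (uncompressed) size
  data  BLOB               -- content, compressed only if that is smaller
);
"""


def add_file(db, name, content, mode=0o100644, mtime=0):
    """Store one file (content must be bytes)."""
    compressed = zlib.compress(content)
    data = compressed if len(compressed) < len(content) else content
    db.execute(
        "INSERT OR REPLACE INTO sqlar(name, mode, mtime, sz, data) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, mode, mtime, len(content), data),
    )


def read_file(db, name):
    """Return the original bytes for one stored file."""
    row = db.execute(
        "SELECT sz, data FROM sqlar WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    sz, data = row
    # Equal sizes mean the content was stored uncompressed.
    return data if len(data) == sz else zlib.decompress(data)


# In-memory demonstration with made-up file contents.
db = sqlite3.connect(":memory:")
db.executescript(SQLAR_SCHEMA)
add_file(db, "README.md", b"hello " * 100)
add_file(db, "Makefile", b"all:\n")
```

A versioned container would need more than this one table, since each file name here maps to exactly one snapshot of its content.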

Current Fossil provides web-page links and command-line operations for generating a ZIP archive or tarball for a particular check-in. Fossil-NG should continue to have that capability, but should also be able to generate a single-checkin repository that serves as a file container.

Support For Multiple VCS Formats

Fossil-NG should provide the ability to store artifacts in various formats:

  1. Git
  2. Mercurial
  3. Legacy Fossil
  4. Fossil-NG

This is a generalization of the Git-specific cases above: in principle, Fossil is not tied to a single file format. It is possible to encode a Fossil repository atop substrates other than the current SQLite-based DB file format.

A repository that contains only Git-formatted artifacts should have the capability to interoperate seamlessly with Git repositories. This allows, for example, a Fossil-NG user to clone and use a repository out of GitHub while continuing to use the superior Fossil user interface. It also allows a Fossil server to answer clone/push/pull requests from legacy Git clients. The same symmetry applies to Mercurial and legacy Fossil.

Thus, by using a single Fossil-NG client and user interface, a developer can interact with legacy repositories in a variety of formats, without having to learn the idiosyncrasies of multiple VCSes.

Fossil-NG itself will be able to operate on repositories that hold a mixture of artifacts in various formats. Legacy Git, Hg, and Fossil clients, however, will only be able to interoperate with Fossil-NG repositories that host all artifacts in the one format that the client understands.

Converting artifacts from one format to another is an expensive computation - too expensive to do on-the-fly. However, it is conceivable that Fossil-NG could be used to convert repositories between Git/Hg/Fossil formats as an off-line operation.
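
One contributor to that cost, offered here as an illustration rather than as the full explanation, is that the same file content receives a different identifier in each format. Git names a blob by the SHA-1 of a short header followed by the content, while Fossil names an artifact by a hash of the raw content (SHA-1, or SHA3-256 under the newer hash policy). Because check-ins refer to files by these identifiers, converting a repository means rehashing and rewriting its artifacts. The function names below are assumptions made for this sketch.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Git blob name: SHA-1 over "blob <size>\\0" followed by the content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def fossil_sha1_id(content: bytes) -> str:
    """Fossil artifact name under the SHA-1 policy: hash of the raw content."""
    return hashlib.sha1(content).hexdigest()


def fossil_sha3_id(content: bytes) -> str:
    """Fossil artifact name under the SHA3-256 policy."""
    return hashlib.sha3_256(content).hexdigest()


sample = b"int main(void){return 0;}\n"
ids = {
    "git": git_blob_id(sample),
    "fossil-sha1": fossil_sha1_id(sample),
    "fossil-sha3": fossil_sha3_id(sample),
}
```

All three identifiers differ for the same bytes, which is why a mixed-format repository cannot simply relabel artifacts when a legacy client asks for them.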