Hashes: Fossil Artifact Identification

All artifacts in Fossil are identified by a unique hash, currently using the SHA3 algorithm by default, but historically using the SHA1 algorithm. Therefore, there are two full-length hash formats used by Fossil:

Algorithm   Raw Bits   Hexadecimal digits
SHA3-256    256        64
SHA1        160        40
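
The lengths in the table above can be checked with a short Python sketch. It is only an illustration: the helper name classify_full_hash is hypothetical and is not part of Fossil, and the input bytes are a stand-in rather than real artifact content.

```python
import hashlib

# Hex-digit counts from the table above, mapped to the algorithm name.
FULL_HASH_LENGTHS = {64: "SHA3-256", 40: "SHA1"}

def classify_full_hash(text):
    """Return the algorithm implied by a full-length hex hash, or None."""
    text = text.strip().lower()
    if not text or any(c not in "0123456789abcdef" for c in text):
        return None
    return FULL_HASH_LENGTHS.get(len(text))

# Stand-in input: this only demonstrates the digest lengths.
sha1_hex = hashlib.sha1(b"example").hexdigest()
sha3_hex = hashlib.sha3_256(b"example").hexdigest()

print(len(sha1_hex), classify_full_hash(sha1_hex))  # 40 SHA1
print(len(sha3_hex), classify_full_hash(sha3_hex))  # 64 SHA3-256
```

Anything that is not a full-length hexadecimal string, such as a shortened prefix, yields None here, since its length alone does not identify the algorithm.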

There are many types of artifacts in Fossil: commits (a.k.a. check-ins), tickets, ticket comments, wiki articles, forum postings, file data belonging to check-ins, etc.

There is a loose hierarchy of terms used instead of “hash” in various parts of the Fossil UI, terms we try to use consistently, though we have not always succeeded. We cover each of those terms in the sections below.

Names

Several Fossil interfaces accept a wide variety of check-in names: commit artifact hashes, ISO8601 date strings, branch names, etc.

Artifact hashes are names, but not all names are artifact hashes. We use the broader term to refer to the whole class of options, and we use the specific terms when we mean one particular type of name.

Versions

When an artifact hash refers to a specific commit, Fossil sometimes calls it a “VERSION,” a “commit ID,” or a “check-in ID.” This is a specific type of artifact hash, distinct from, let us say, a wiki article artifact hash.

We may eventually settle on one of these terms, but all three are currently in common use within Fossil’s docs, UI, and programming interfaces.

A unique prefix of a VERSION hash is itself a VERSION. That is, if your repository has exactly one commit artifact with a hash prefix of “abc123”, then that is a valid version string as long as it remains unambiguous.
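
The uniqueness rule for prefixes can be sketched in Python as follows. This is a minimal illustration, not Fossil's actual lookup code: the function resolve_prefix and the shortened hashes are hypothetical.

```python
def resolve_prefix(prefix, commit_hashes):
    """Return the one hash that starts with prefix; raise if none or several do."""
    prefix = prefix.lower()
    matches = [h for h in commit_hashes if h.startswith(prefix)]
    if not matches:
        raise LookupError(f"no commit matches {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"{prefix!r} is ambiguous: {len(matches)} commits match")
    return matches[0]

# Hypothetical, shortened hashes used for illustration only.
commits = ["abc123ff01", "abc9dd0412", "0f3e5522aa"]

print(resolve_prefix("abc123", commits))  # abc123ff01

try:
    resolve_prefix("abc", commits)  # two hashes share this prefix
except LookupError as err:
    print(err)
```

The prefix "abc123" resolves because exactly one hash starts with it, while "abc" is rejected because two do. A prefix that is valid today can become ambiguous later if a new commit happens to share it.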

Unconventional Use Of The Term "UUID"

"UUID" is an acronym for "Universally Unique Identifier". Hashes generated by SHA1 or SHA3-256 are universally unique (in practice, if not in theory) and they identify a particular artifact, and so it seems reasonable to refer to artifact hashes as UUIDs.

However, the term UUID has acquired a much stricter meaning than its name alone implies. Purists insist that UUIDs must be exactly 128 bits, that they must be displayed in a particular hexadecimal format that includes dashes at prescribed intervals, and that they must have four particular bits set aside to indicate the "type" of UUID. Fossil artifact hashes do not comply with any of these supplemental requirements, and so are not UUIDs in the strictest sense of the word. But the artifact hashes in Fossil are literally "universally unique identifiers", and so they are sometimes called "UUIDs" anyhow.
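
The mismatch can be seen with Python's standard uuid module, used here as a stand-in for the strict definition. This is a rough sketch: the helper is_strict_uuid is hypothetical, the hashed bytes are arbitrary, and the parser only enforces the 32-hex-digit length, not every rule a purist would apply.

```python
import hashlib
import uuid

def is_strict_uuid(text):
    """True if Python's uuid parser accepts text (32 hex digits, dashes optional)."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True

strict = uuid.uuid4()
print(len(strict.hex) * 4)  # 128 bits
print(len(str(strict)))     # 36 characters, including four dashes
print(strict.version)       # 4, read from the bits reserved for the type

# Arbitrary input bytes; only the digest lengths matter here.
sha1_hex = hashlib.sha1(b"example").hexdigest()      # 40 hex digits, 160 bits
sha3_hex = hashlib.sha3_256(b"example").hexdigest()  # 64 hex digits, 256 bits

print(is_strict_uuid(str(strict)))  # True
print(is_strict_uuid(sha1_hex))     # False
print(is_strict_uuid(sha3_hex))     # False
```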

Some readers are greatly annoyed by Fossil's use of "UUID" in its most literal sense. To those readers, the designer apologizes, and seeks your mercy by noting that when the term "UUID" first began to be used by Fossil, only SHA1 was supported and so all the artifact hashes were 160 bits, making them close to, if not exactly in compliance with, the rigid definition of the term. For his misuse of the term "UUID", the designer has been frequently rebuked. Some efforts have been made, over the ensuing years, to avoid and replace "UUID" in newer code and documentation. But it does not seem like such a serious issue as to require an immediate purge of the term from existing documentation, code, and database schemas, as some have suggested. Hence, the unconventional use of the term "UUID" lingers on in Fossil. Let new readers beware.

Places where the non-conforming use of "UUID" persists in Fossil are discussed in the sections below.

Repository DB Schema

Almost all remaining uses of the term "UUID" in Fossil derive from the blob.uuid table column. This is a key lookup column in the most important persistent Fossil DB table, so it influences broad swaths of the Fossil internals.

It is theoretically possible to rename this column and those it has influenced (e.g. purgeitem.uuid, shun.uuid, and ticket.tkt_uuid) by making Fossil detect the outdated schema and silently upgrade it, coincident with updating all of the SQL in Fossil that refers to these columns. But that is a large and error-prone edit that does not serve any pressing need, and so is unlikely to happen any time soon. Hence, Fossil will likely continue to have “UUID” all through its internals.

In order to avoid needless terminology conflicts, Fossil code that refers to these columns also uses some variant of “UUID.” For example, C code that refers to SQL result data on blob.uuid usually calls the variable zUuid. Another example is the internal function uuid_to_rid(). Until and unless the columns are renamed, these associated function names will likely also go unchanged.

You may have local SQL code that digs into the repository DB using these column names. If so, be warned: we are not inclined to consider existence of such code sufficient reason to avoid renaming the columns. The Fossil repository DB schema is not considered an external user interface, and internal interfaces are subject to change at any time. We suggest switching to a more stable API: the JSON API, /timeline.rss, TH1, etc.
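
To make the risk concrete, here is a sketch of the kind of local code that depends on the blob.uuid column name. It runs against a simplified in-memory stand-in table, not a real repository; the table keeps only the two columns needed for the example, and the function rid_for_hash is hypothetical.

```python
import sqlite3

def rid_for_hash(db, artifact_hash):
    """Look up a record ID by full hash; returns None if there is no match."""
    # The column name "uuid" is hard-coded here, so this query would
    # break if the column were ever renamed.
    row = db.execute(
        "SELECT rid FROM blob WHERE uuid = ?", (artifact_hash,)
    ).fetchone()
    return row[0] if row else None

# Simplified stand-in for the repository table, for illustration only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE blob(rid INTEGER PRIMARY KEY, uuid TEXT UNIQUE NOT NULL)")
db.execute("INSERT INTO blob(rid, uuid) VALUES (1, ?)", ("a" * 40,))

print(rid_for_hash(db, "a" * 40))  # 1
print(rid_for_hash(db, "b" * 40))  # None
```

Code of this shape is exactly what a column rename would break, which is why the more stable interfaces listed above are the safer choice.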

TH1 Scripting Interfaces

Some TH1 interfaces use “UUID” where they actually mean some kind of hash. One example is the $tkt_uuid variable, which is available via TH1 when customizing Fossil’s ticket system.

Because this is considered a public programming interface, we are unwilling to unilaterally rename such TH1 variables, even though they are "wrong". For now, we are simply documenting the unconventional terminology.

JSON API Parameters and Outputs

The JSON API frequently uses the term “UUID” in the same sort of way, most commonly in artifact and timeline APIs. As with the prior case, we can’t fix these without breaking code that uses the JSON API as originally designed, so our solution is the same: document the unconventional usage.

manifest.uuid

If you have the manifest setting enabled, Fossil writes a file called manifest.uuid at the root of the check-out tree containing the commit hash for the current checked-out version. Because this is a public interface, we are unwilling to rename the file for correctness.
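
A build script might read this file as sketched below. The sketch assumes the file holds a single full-length hexadecimal hash followed by optional whitespace; the function read_checkout_hash is hypothetical, and the demo writes a stand-in file to a temporary directory rather than using a real check-out.

```python
import hashlib
import tempfile
from pathlib import Path

def read_checkout_hash(checkout_root):
    """Return the hash stored in manifest.uuid at the given check-out root."""
    text = (Path(checkout_root) / "manifest.uuid").read_text(encoding="ascii").strip()
    # Assumption: a full-length hash is 40 (SHA1) or 64 (SHA3-256) hex digits.
    if len(text) not in (40, 64) or any(c not in "0123456789abcdef" for c in text):
        raise ValueError("manifest.uuid does not contain a full-length hex hash")
    return text

# Demo with a stand-in file; a real check-out would already contain it.
with tempfile.TemporaryDirectory() as root:
    fake_hash = hashlib.sha3_256(b"stand-in").hexdigest()
    (Path(root) / "manifest.uuid").write_text(fake_hash + "\n", encoding="ascii")
    commit_hash = read_checkout_hash(root)

print(len(commit_hash))  # 64
```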