Fossil

Signing and verification of artifacts

This document aims to bring Fossil closer to ubiquitous, seamless and useful signing and verification of artifacts.

This is a draft!
It is incomplete. It sketches out a few possible solutions. These solutions try to balance flexibility and complexity.

Agenda and context

The main point is to enable strong authenticity not just for "data in transit" but also for "data at rest". This would enable some interesting features:

The above would make Fossil a robust distributed system that, by design, cannot be surpassed by any "server-based" service (e.g. GitHub and the like).

The idea is not new. Monotone (which is a kind of ancestor of Fossil) automatically signs every commit. But Monotone seems to be orphaned and does not support all the goodies provided by Fossil (such as a customizable WebUI, Tickets, Wiki, Forum and so on).
The optimal way to implement the feature in Fossil is not obvious, so lengthy discussions about the details can easily be anticipated.

Some related topics have already arisen at the Forum:

Some more recent noteworthy opinions on the topic:

ravbc on 2020-10-22:

IMHO, there is no easy escape from distributing public keys within a repository

offray on 2020-10-23:

I really like the idea of having public keys uploaded to the repository and signed by others in it.

wyoung on 2020-12-13:

All you can do is establish a PKI standard within the set of repos you do control.

wyoung on 2021-09-18:

It is quite unlikely that your Fossil server has a wild assortment of PGP keys

george on 2022-05-29:

Fossil 2.19 should accept structural artifacts with signatures in some prominent (yet undecided) format...

Identity model

An identity is a cryptographically sound avatar of a human being. An identity is distinguished by the public key of its main keypair[1], which is referred to either directly (for signature schemes with short public keys, such as Ed25519) or through its hash (for signature schemes with long public keys, such as Ed448 and RSA). In both cases a human-friendly variant of base32 encoding[2] is used in order to prevent confusion with artifacts' UUIDs and also to facilitate verbal transfers (in the context of signing parties and the like).
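The naming rule above can be sketched as follows. This is only an illustration: the draft says "something like" Crockford's Base32, so the exact alphabet, the 32-byte threshold between "short" and "long" keys, and the use of SHA-256 as the hash are all assumptions, and the function names are hypothetical.

```python
import hashlib

# Crockford's Base32 alphabet: digits plus letters, without I, L, O and U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Assumption: keys of up to 32 bytes (the size of an Ed25519 public key)
# are referred to directly; longer keys are referred to through a hash.
SHORT_KEY_BYTES = 32


def crockford_b32(data: bytes) -> str:
    """Encode bytes with 5 bits per symbol, zero-padding the last symbol."""
    nbits = len(data) * 8
    pad = (-nbits) % 5                      # bits needed to reach a multiple of 5
    value = int.from_bytes(data, "big") << pad
    nsym = (nbits + pad) // 5
    return "".join(
        CROCKFORD[(value >> (5 * (nsym - 1 - i))) & 31] for i in range(nsym)
    )


def identity_name(public_key: bytes) -> str:
    """Return the human-friendly name of an identity's main public key."""
    if len(public_key) <= SHORT_KEY_BYTES:
        return crockford_b32(public_key)
    # SHA-256 is an assumption; the draft does not name a hash function.
    return crockford_b32(hashlib.sha256(public_key).digest())
```

Under these assumptions both a 32-byte key and the hash of a longer key come out as 52 symbols, so names have a uniform length, and the presence of letters beyond F keeps them visually distinct from hexadecimal artifact hashes.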

An identity does not expire, but can be explicitly abrogated. The identity's main key may be used to claim that it was compromised or intentionally destroyed.[3] The identity's main key may also be used to declare a trusted revoker[4]: a public key that is authorized to claim that the identity's main key is lost, destroyed or compromised. The first of these claims ("lost") may be reversed using the identity's main key, while in the latter two cases the whole identity is permanently abrogated. A trusted revoker may be a key that is under the exclusive control of the identity's owner or may be the main key of some other identity. In both cases the authorization of the trusted revoker may have an expiration time set and may also be limited to just some of the claims (for example, only "lost" and "destroyed" claims may be authorized). A trusted revoker need not be public unless it is used.
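To make the revoker rules concrete, here is a minimal sketch of how an authorization and a claim check might be represented. All names, the string form of keys, and the use of plain numeric timestamps are hypothetical; the draft does not define a wire format, and the notion of "when" is itself problematic in a distributed system.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

CLAIMS = ("lost", "destroyed", "compromised")


@dataclass(frozen=True)
class RevokerAuthorization:
    revoker_key: str                     # hypothetical: encoded public key of the revoker
    allowed_claims: FrozenSet[str]       # subset of CLAIMS the revoker may make
    expires_at: Optional[float] = None   # hypothetical timestamp; None means no expiry


def revoker_claim_accepted(auth: RevokerAuthorization, signer_key: str,
                           claim: str, claimed_at: float) -> bool:
    """Check a claim signed by signer_key against one authorization."""
    if claim not in CLAIMS:
        raise ValueError(f"unknown claim: {claim}")
    if signer_key != auth.revoker_key:
        return False
    if auth.expires_at is not None and claimed_at > auth.expires_at:
        return False
    return claim in auth.allowed_claims


def is_permanent(claim: str) -> bool:
    """A "lost" claim may be undone with the main key; the other two abrogate the identity."""
    return claim in ("destroyed", "compromised")
```

For example, an authorization limited to "lost" and "destroyed" rejects a "compromised" claim from the same revoker, and rejects any claim made after the authorization has expired.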

A set of projects that are relevant for a particular identity will be denoted as identity's context.

An identity's context is partitioned into workspaces. This means that a workspace is a subset of the projects relevant for that identity, and that at any moment of time no two workspaces overlap. However, projects may be added to or removed from workspaces over time. An identity's context may consist of just a single workspace. Similarly, a workspace may consist of just a single project.
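The partition requirement is easy to state as a check. The sketch below assumes projects are identified by strings and that workspaces are kept in a dictionary keyed by a workspace name; both are illustrative choices, not part of the draft.

```python
from typing import Dict, Set


def is_partition(context: Set[str], workspaces: Dict[str, Set[str]]) -> bool:
    """True if the workspaces are pairwise disjoint and together cover the context."""
    seen: Set[str] = set()
    for projects in workspaces.values():
        if seen & projects:          # a project appears in two workspaces
            return False
        seen |= projects
    return seen == context
```

A context of three projects split as `{"work": {"a", "b"}, "hobby": {"c"}}` passes, while putting project `"b"` into both workspaces fails.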

A person may have just one identity or may choose to maintain several identities (perhaps with different organization of workspaces). It is advised to have as few identities as is reasonable[5]; this model tries to be flexible enough to permit that.

Workspace subkeys are used for general-purpose signing of structural artifacts (check-ins, posts, ticket changes, wiki edits etc.). Each workspace subkey is limited in scope to a particular workspace and must be neither used nor propagated outside of that workspace. Each workspace subkey has an expiration date set.[6] A workspace subkey may be used to revoke itself.
In the following, "subkey" and "work key" are short forms of "workspace subkey".

At any moment of time a particular identity may have just one active workspace subkey within any workspace. In other words, several workspace subkeys of a particular identity must not be used simultaneously within any project. If such a clash is observed, the identity should be treated as misbehaving and suspicious.[7]
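A clash of this kind could be detected by comparing the periods during which subkeys were active. The sketch below is one possible reading: the `Subkey` record and its fields are hypothetical, and a key is assumed to be active from its introduction until its expiration or revocation, whichever comes first.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Subkey:
    key_id: str
    workspace: str
    active_from: datetime
    active_until: datetime   # expiration or revocation, whichever comes first


def find_clashes(subkeys: List[Subkey]) -> List[Tuple[str, str]]:
    """Return pairs of one identity's subkeys that overlap in time within a workspace."""
    clashes = []
    for a, b in combinations(subkeys, 2):
        same_workspace = a.workspace == b.workspace
        overlap = a.active_from < b.active_until and b.active_from < a.active_until
        if same_workspace and overlap:
            clashes.append((a.key_id, b.key_id))
    return clashes
```

A rotation in which the new subkey starts exactly when the old one ends is not reported, and neither are overlapping subkeys that belong to different workspaces.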

It is assumed that the safety of the main key is maintained on a higher level than the safety of the workspace subkeys; and that safety of trusted revoker(s) (if any) is somewhere in between.

Auxiliary definitions

Trust model

The system should try to answer the ultimate inquiry from a user:

Is this particular signed artifact a legitimate one or counterfeited?

The answer to this question is guaranteed to be "it is legitimate" if and only if at the moment when that signed artifact was created

The reality is more complicated because often there is a bit of uncertainty. The system should derive a probabilistic answer based on the estimates of probabilities for the values of the above predicates.

That calculation might use reasoning about the possible temporal[11] sequence of events, as well as the claims from identities within the relevant project(s).

Propositions within binary claims fall into one of three categories:

  1. Connectedness
    This is quantified as ERL (short for expected response lag), which estimates the typical duration of information roundtrips[12].
    It sums the durations needed for

    • workspace's new information to reach claim's destination,
    • destination to understand this information and prepare a response,
    • that response to reach a substantial part of the workspace's participants.
  2. Integrity
    Encapsulates the safety of a particular workspace subkey and also the willingness of the destination's owner to revoke or rotate a subkey immediately upon the discovery of a key compromise.

    This is quantified as a transient probability that a signed artifact is legitimate. It is a triple of scalars, where each scalar estimates the aforementioned probability for a certain moment:

    • right after a signed artifact has been received,
    • a moment that is two ERLs later[13],
    • a moment that is five ERLs after an artifact was received[13].

    If the corresponding workspace subkey is revoked then all these probabilities are invalidated.

  3. Trustworthiness
    An estimate of the trust that the source places in pairwise claims signed by the destination.

    This is quantified as an integer in the range [-3;+3] which represents a bias of a claim's destination relative to its source on the abstract axis "trustworthiness". This abstract axis encapsulates and integrates three very different characteristics of a human being:

    • safety
      — ability to prevent counterfeiting of claims (through a leak of the main key in particular);
        this aggregates

      • severity of threats
      • willingness to resist
      • resources for defense (such as skills, laws, money, etc.)
    • perspicacity
      — ability to deduce the truth; about other identities in particular.

    • honesty
      — intolerance to the falsity of one's own propositions; one's own claims in particular.

    The integer values of 0, ±1, ±2 and ±3 may be interpreted as "same", "slightly", "noticeably" and "much" respectively.
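The three kinds of propositions above could be carried in a structure like the one sketched here. Everything beyond the stated ranges is an assumption: the field names, seconds as the unit of ERL, reading a positive bias as "more trustworthy", and above all the piecewise-linear interpolation between the three integrity estimates, which the draft does not prescribe.

```python
from dataclasses import dataclass
from typing import Tuple

TRUST_LABELS = {0: "same", 1: "slightly", 2: "noticeably", 3: "much"}


def describe_trust(bias: int) -> str:
    """Render a trustworthiness bias in [-3, +3] in words."""
    if not -3 <= bias <= 3:
        raise ValueError("bias must be an integer in [-3, +3]")
    if bias == 0:
        return TRUST_LABELS[0]
    direction = "more" if bias > 0 else "less"   # sign convention is an assumption
    return f"{TRUST_LABELS[abs(bias)]} {direction} trustworthy"


@dataclass(frozen=True)
class CiProposition:
    erl: float                              # expected response lag (seconds assumed)
    integrity: Tuple[float, float, float]   # estimates at 0, 2 and 5 ERLs after receipt

    def __post_init__(self):
        if self.erl <= 0:
            raise ValueError("ERL must be positive")
        if any(not 0.0 <= p <= 1.0 for p in self.integrity):
            raise ValueError("integrity estimates must be probabilities")

    def integrity_at(self, elapsed: float) -> float:
        """Interpolate linearly between the three estimates (an assumed approximation)."""
        p0, p2, p5 = self.integrity
        t2, t5 = 2 * self.erl, 5 * self.erl
        if elapsed <= 0:
            return p0
        if elapsed <= t2:
            return p0 + (p2 - p0) * elapsed / t2
        if elapsed <= t5:
            return p2 + (p5 - p2) * (elapsed - t2) / (t5 - t2)
        return p5
```

With an ERL of 10 and estimates of 0.5, 0.8 and 0.95, the interpolated value one ERL after receipt is halfway between the first two estimates, and the value stays at 0.95 beyond five ERLs.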

A claim with a proposition about trustworthiness will be referred to as a t-claim. A t-claim is propagated to all projects that are relevant for both the source and the destination. T-claims form a global "social graph".

A claim with propositions about connectedness and integrity will be referred to as a ci-claim. A ci-claim is propagated to all projects that

For a given signed artifact it is possible to estimate its legitimacy, provided that the "social graph" contains a path from the identity that makes the inquiry to the identity that signed the artifact.
The probability that a signed artifact is legitimate may be computed for an arbitrary moment of time as a weighted average of the approximated integrities from the available ci-claims.
The aforementioned weights are derived from the t-claims using a computation over the underlying "social graph". This computation starts from the identity that makes the inquiry and computes the weights of other identities in a BFS-like[14] manner, until the author of the artifact is reached.
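The computation described above might look roughly like the sketch below. The draft does not say how a t-claim turns into a weight, so the `hop_factor` mapping, the rule that the first path found in BFS order fixes an identity's weight, and the dictionary shapes are all assumptions made for illustration.

```python
from collections import deque
from typing import Dict, Optional


def hop_factor(bias: int) -> float:
    """Hypothetical mapping of a t-claim bias in [-3, +3] to a multiplier in [0, 1]."""
    return (bias + 3) / 6


def bfs_weights(t_claims: Dict[str, Dict[str, int]],
                inquirer: str, author: str) -> Dict[str, float]:
    """Walk the social graph breadth-first from the inquirer until the author is reached.

    t_claims maps a source identity to {destination identity: bias}.
    """
    weights = {inquirer: 1.0}
    queue = deque([inquirer])
    while queue:
        src = queue.popleft()
        if src == author:
            break
        for dst, bias in t_claims.get(src, {}).items():
            if dst not in weights:            # first path found wins (assumption)
                weights[dst] = weights[src] * hop_factor(bias)
                queue.append(dst)
    return weights


def legitimacy(weights: Dict[str, float],
               integrities: Dict[str, float]) -> Optional[float]:
    """Weighted average of integrity estimates given by the sources of ci-claims."""
    pairs = [(weights[i], p) for i, p in integrities.items()
             if weights.get(i, 0.0) > 0.0]
    total = sum(w for w, _ in pairs)
    if total == 0.0:
        return None                           # no usable path in the social graph
    return sum(w * p for w, p in pairs) / total
```

Here `integrities` holds, for each identity that issued a ci-claim about the author's subkey, the integrity approximated for the moment of interest. Returning `None` when no weighted source exists mirrors the condition that a path in the social graph is required.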

Footnotes


  1. ^ This is essential. It enables the same identity to participate in different projects, even though the owner of that identity previously was registered and is participating in these projects under different UserIDs. See also forum post ae37ac84285.
  2. ^ Something like Crockford's Base32 encoding.
  3. ^ This is a bit speculative because the signing of the "intentionally destroyed" claim has to precede the actual destruction of the last copy of a secret key; and that actual destruction may fail silently.
  4. ^ It's yet unclear which word is more appropriate: "trusted" or "designated".
  5. ^ An underlying conjecture is that it should help to improve the connectedness of the global Web of Trust.
  6. ^ Whenever a workspace subkey is introduced, prolonged or rotated, there is an upper bound on the eligible lifespan. The exact optimal lifespan depends on the workspace. A lifespan of 14 months is suggested as a hard-coded maximum.
  7. ^ It may be tempting to allow several simultaneous workspace subkeys within a project. In that case different devices could use a dedicated workspace subkey. Thus if a leak of the corresponding secret key should occur then it would be possible to identify (and fix) the device that permitted that leak. However, it looks like a significant complication of the model, which for the time being seems neither necessary nor desirable.
  8. ^ It's unclear which word is more appropriate: "binary", "pairwise" or some other.
  9. ^ It's unclear which word is more appropriate: "destination", "target" or some other.
  10. ^ This special case of a binary claim may be viewed as a claim about trustworthiness.
  11. ^ The notion of "when" is rather complicated for a distributed system without a single source of trusted timestamps. The only thing that can be guaranteed is that the knowledge of the output of a secure hash function can not precede the knowledge of the corresponding input.
  12. ^ The notion of "roundtrip" is blurry if there is no central server. In that case it is more about dissipation of information in "both directions" through the network of retransmitters (not all of which are necessarily participants of a project).
  13. ^ The exact values of these delays are debatable. It is assumed that two ERLs might be enough for the destination to react to impersonation, and five ERLs might be enough for a reaction from a trusted revoker or other participants of the workspace. If the delay is modeled by an Erlang-2 distribution, then two ERLs give a 91% probability that a response has been received.
  14. ^ Breadth-first search. Proceeds like an expanding concentric wave on the water.
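The 91% figure in footnote 13 can be reproduced with a few lines, assuming that the Erlang-2 delay has a mean of exactly one ERL (so its rate is 2/ERL). The function name is hypothetical.

```python
import math


def erlang2_cdf(t_in_erls: float) -> float:
    """P(response received within t) for an Erlang-2 delay whose mean is one ERL."""
    x = 2.0 * t_in_erls          # rate * t, with rate = 2 / ERL
    return 1.0 - math.exp(-x) * (1.0 + x)


print(round(erlang2_cdf(2), 4))  # two ERLs
print(round(erlang2_cdf(5), 4))  # five ERLs
```

Under that assumption two ERLs give a probability of about 0.908, matching the footnote, and five ERLs give about 0.9995.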