# A12
A12 is a remote network protocol for interactive, realtime multimedia
applications. It has been designed as the network equivalent of the local
display server API and IPC system, [SHMIF], used by [ARCAN]. To achieve this it
adds extensions for supporting confidentiality, integrity, discovery and
adaptive compression.
This document provides an informal introduction for implementing and using the
protocol, documents its security considerations, and gives an overview of the
existing tools and support libraries that leverage it today.
## Table Of Contents
1. [Introduction](#introduction)
2. [Dependencies](#dependencies)
3. [Authentication and Cryptography](#authentication-and-cryptography)
4. [Commands](#commands)
5. [Streaming Transfers](#streaming-transfers)
2. [Video](#video)
3. [Audio](#audio)
4. [Binary](#binary)
5. [Text](#text)
6. [Event Model](#event-model)
1. [Input](#input)
2. [Target Commands](#target)
3. [External Hints](#external-hints)
7. [Example Flow and Lifecycle](#flow)
8. [Directory Extension](#directory-extension)
2. [File Transfer](#directory-file)
3. [FAP Format](#directory-fap)
9. [Discovery Extension](#discovery-extension)
10. [Security](#security)
11. [Tools and Reference Implementations](#tools-and-reference-implementations)
12. [Future Changes](#future-changes)
13. [Acknowledgements](#acknowledgements)
14. [References](#references)
# Introduction
There are few protocols around for 'remote desktop' like applications, and none
that covers the needs of modern desktop cooperating with mobile devices while
taking the long legacy of misaligned features into account.
Instead of new protocols surfacing, there is a long list of proprietary
extensions to existing protocols like [RFC6143] and [RFC4254]. Other options
like [SPICE] see little development and have drifted into specialised niches
such as Virtual Machine monitors, or, like [MSRDP], are complex to implement
correctly.
The A12 protocol described in this document seeks to remedy that situation.
# Dependencies
Many of the primitives used are from other established algorithms and protocols.
The dependencies that MUST be present are:
[x25519] for public key cryptography. [CHACHA20] as stream cipher. [BLAKE3]
used for MAC construction, key derivation, hashing for cache management and
integrity. [ZSTD] for generic compression.
It is also RECOMMENDED that [H264] is present for video compression, but an
implementation MUST provide work-arounds for its absence.
All integral types are in network byte order.
<a name="authentication-and-encryption">
# Authentication and Cryptography
</a>
Every packet except for the first has the same outer frame format:

    |--------------------------|
    | 16 octet MAC             |
    |--------------------------| ---
    | u64 sequence number      |  |
    | 1 byte type              |  | encrypted
    |--------------------------|  |
    | [ type dependent block ] |  |
    |--------------------------| ---

The first packet has the MAC truncated to 8 octets, and an 8 octet
cryptographically secure pseudorandom number used as a nonce for key
derivation.
There are 5 possible packet types, covered in their respective sections:
Control Command (1) in [Section 4, Commands](#commands), Event (2) in [Section
6, Event Model](#event-model), Video Data (3), Audio Data (4), Binary Data (5)
in [Section 5](#streaming-transfers).
## Key Derivation
The key derivation used for the authentication packet is as follows:
kA = H(message = 'arcan-a12 init-packet', passphrase, nonce)
kMac = H(kA)
kCl = H(kMac)
kSrv = H(kCl)
It is a [HKDF] style scheme, but using [BLAKE3] in KDF mode.
Unless another passphrase has been agreed upon, it MUST be set to the default
'SETECASTRONOMY'. In controlled environments where there is a pre-existing
secure communication channel, the passphrase can be swapped to a limited use
one as needed.
When using a 3rd party rendezvous to establish a connection between a source
and a sink, the 3rd party will generate one in order for the two to
authenticate the public keys used. See the 'Directory extension' section for
this.
kMac is used to calculate the MAC for each packet according to:
MAC = H(previous_MAC | packet_octets)
with packet\_octets starting after the MAC field of the packet and continuing
through the length of the packet.
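The chained MAC construction above can be sketched as follows. This is only an illustration under stated assumptions: the Python standard library has no BLAKE3, so keyed BLAKE2b is swapped in as a stand-in for H, the starting value of `previous_MAC` is assumed to be all zeroes, and the function names `next_mac` and `verify_chain` are hypothetical. The 8 octet truncation of the first packet is not covered.

```python
import hashlib
import hmac

MAC_LEN = 16  # the outer frame carries a 16 octet MAC


def next_mac(k_mac: bytes, previous_mac: bytes, packet_octets: bytes) -> bytes:
    """MAC = H(previous_MAC | packet_octets), keyed with kMac.

    Stand-in: keyed BLAKE2b replaces keyed BLAKE3 (not in the stdlib).
    """
    h = hashlib.blake2b(key=k_mac, digest_size=MAC_LEN)
    h.update(previous_mac)
    h.update(packet_octets)
    return h.digest()


def verify_chain(k_mac: bytes, packets, initial_mac: bytes = bytes(MAC_LEN)) -> bool:
    """Check (mac, packet_octets) pairs in arrival order.

    The all-zero initial_mac is an assumption, not taken from the text.
    """
    previous = initial_mac
    for mac, octets in packets:
        expected = next_mac(k_mac, previous, octets)
        if not hmac.compare_digest(mac, expected):
            return False
        previous = mac
    return True
```

Because each MAC covers the previous one, a reordered or dropped packet fails verification even if its own contents are intact.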
The reduced 8-round variant of [CHACHA20] is used as per the recommendations in
[TOOMUCHCRYPTO]. This is done in order to allow a lower tier of hardware
without acceleration to still get reasonable throughput.
Each side initiates the ChaCha8 state machine using the tuple {kCl, nonce} for
the client end, and {kSrv, nonce} for the server end.
The reason this setup is used before initiating actual key exchange and
derivation according to x25519 is to ensure that there is no reliable
fingerprint in the initial packet exchange, as well as for enabling passphrase
preauthentication of unknown public keys.
Before a connection is completely authenticated, the only packet type MUST be
'Control' (=1). It is the only packet type accepted in a preauthenticated state
and only the HELLO command is permitted. See the control section for details on
its general structure.
The first command sent, from client to server and then as a matching reply from
server to client, is 'HELLO' (=0). The fields used are as follows:
version-major: u8
version-minor: u8
mode: u8
kpub: u8[32]
role: u8
petname: u8[16]
The version fields SHOULD be pegged to the corresponding version of the
arcan-shmif build, if present, to assist with debugging the other end. In the
final release of this protocol the version will be set to 1 major 0 minor and
incremented according to [SEMVER], should any critical change be necessary
after the fact.
The 'mode' field specifies the authentication mode desired, and MUST be one of
the following:
0. no-exchange. Keep using the current derived keys for all communication.
This is NOT RECOMMENDED unless mandated by the legal environment.
1. x25519-direct. Start x25519 exchange using the provided kpub.
The other end will respond in kind.
2. x25519-ephemeral. The public key provided is a temporary one.
The purpose of the x25519-ephemeral mode is to establish a more secure channel
before transmitting the actual public keys in order to force an aggressor to
actively perform a man in the middle attack to harvest the actual public keys
for tracking and correlation across sessions.
In that mode, both sides will treat the other ephemeral key as known, then
transition the mode to x25519 (=1) and repeat the HELLO command, this time with
the real public keys.
The 'role' can be set to either:
1 = Source
2 = Sink
3 = Probe
4 = Directory
5 = Directory-reference
Directory and Directory-reference are specifically used for "Directory
Extension" mode covered in [Section 8](#directory-extension). Connecting a
Source to a Source or a Sink to a Sink MUST be prohibited and result in
connection shutdown.
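A minimal sketch of the role check described above. The numeric values come from the list of roles; the names `Role` and `pairing_allowed` are hypothetical, and the sketch only encodes the two prohibited pairings named in the text.

```python
from enum import IntEnum


class Role(IntEnum):
    SOURCE = 1
    SINK = 2
    PROBE = 3
    DIRECTORY = 4
    DIRECTORY_REFERENCE = 5


def pairing_allowed(local: Role, remote: Role) -> bool:
    """Return False for Source-Source and Sink-Sink, True otherwise."""
    if local == remote and local in (Role.SOURCE, Role.SINK):
        return False
    return True
```

A caller would shut the connection down when `pairing_allowed` returns False. Converting an unknown wire value with `Role(value)` raises `ValueError`, which can be treated the same way.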
The difficulty field is ignored unless the remote role is "Directory". Refer to
[Section 8](#directory-extension) for more details on authentication challenge
difficulty.
Probe is used to indicate that there is no intention of performing any data
exchange after the authentication handshake. Its purpose is to be used to
determine the role and availability of the node at the other end as part of
checking the state of the mapping between a local keystore to known addresses.
When the public keys have been authenticated on each side, the key derivation
process is repeated using the established X25519 shared secret:
kA = H(message = 'arcan-a12 init-packet', shared_secret, nonce)
With encode, decode and MAC keys derived as covered previously.
## Rekeying
The REKEY command is only valid after a successfully authenticated connection.
It has the following fields:
mode: u8
mode=0 (ratchet):
kpub: u8[32]
challenge: u8
mode=1 (sigupd):
kpub: u8[32]
ksig: u8[64]
mode=2 (chgreply):
counter: u32
For mode=0 (ratchet), the server end initially holds 'rekeying' ownership. The
current owner may, at any time, issue a REKEY command. This transfers ownership
of the REKEY command over to the other end.
To do this, the owner first generates a new ephemeral X25519 keypair and passes
the new public key as payload to the REKEY command together with a nonce in the
nonce part of the command header. It sends this packet, then rotates keys for
outbound use.
The new shared secret is calculated using the new private key together with the
last known public key of the other endpoint. The outbound cipher and HMAC state
is reset to this together with the nonce attached to the command packet.
The new MAC key is taken from H(message = 'arcan-a12 rekey', shared secret).
It is RECOMMENDED that the server performs the initial REKEY early, and that
further passing of the REKEY back and forth is latched to some trigger, e.g.
after a certain number of bytes of cipherstream has been consumed.
If the server initiated REKEY has a challenge value greater than 0 the client
SHOULD reply with a REKEY mode=2 or the server MAY terminate the connection.
To calculate the counter field for this reply, the client must produce a
counter such that H(COMMAND-nonce, counter) has, at least, (challenge) number
of leading unset bits. This is cheap for the server to verify but expensive for
the client to calculate as per [HASHCASH]. This serves the purpose of
penalising the client for issuing too many requests, such as an aggressive
scraper connected to a public facing directory server.
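The challenge and its verification can be sketched as below. Several details are assumptions, since the text does not fix them: BLAKE2s from the standard library stands in for H (BLAKE3), the hash input is the command nonce concatenated with the counter as a network byte order u32, and leading bits are counted from the most significant bit of the first digest octet. All function names are hypothetical.

```python
import hashlib
import struct


def leading_zero_bits(digest: bytes) -> int:
    """Count unset bits from the start of the digest, most significant first."""
    count = 0
    for byte in digest:
        if byte == 0:
            count += 8
            continue
        count += 8 - byte.bit_length()
        break
    return count


def challenge_hash(command_nonce: bytes, counter: int) -> bytes:
    # Stand-in hash and input encoding, see the assumptions above.
    return hashlib.blake2s(command_nonce + struct.pack("!I", counter)).digest()


def solve_challenge(command_nonce: bytes, challenge: int) -> int:
    """Client side: brute-force a u32 counter that satisfies the challenge."""
    for counter in range(2**32):
        if leading_zero_bits(challenge_hash(command_nonce, counter)) >= challenge:
            return counter
    raise ValueError("no counter satisfies the challenge")


def verify_challenge(command_nonce: bytes, challenge: int, counter: int) -> bool:
    """Server side: a single hash evaluation."""
    return leading_zero_bits(challenge_hash(command_nonce, counter)) >= challenge
```

The asymmetry is visible in the code: solving takes on the order of 2^challenge hash evaluations on average, while verification takes one.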
After a REKEY, old keymaterial MUST be discarded safely. The RECOMMENDED way to
do this is to generate the new keymaterial and hasher/cipher state into the
same memory that the old material consumed.
Mode=1 (sigupd) MUST only ever be used by the client end and ONLY when the
other end is ROLE_DIR. It is used to prove metadata ownership for indexing and
binary annotation. The kpub is the key present in the metadata to be updated or
retrieved. The ksig is an Ed25519 signature of H(hello-nonce | control-nonce).
Hello-nonce is the nonce provided by the server in the HELLO reply when the
connection was authenticated.
Control-nonce is the nonce provided by the client in the REKEY control command.
This is to prevent a malicious server from keeping a REKEY command and later
connecting to another server it knows that the client reuses and the server can
authenticate to, then replaying the REKEY command to impersonate that client.
While this would not permit it to update data, it would grant it permission to
do index retrieval and search based on the signing key.
<a name="commands">
# Commands
</a>
Every control packet type has a fixed size of 128 octets, with any extra octets
padded with noise or zero.
The fields of a control packet are as follows:
-----------------------
last-seen : u64
nonce : u8[8]
ch-id : u8
command : u8
-----------------------
command-specific data
-----------------------
Last-seen provides the sequence number of the latest seen packet from the other
end, or zero if no packet has been received yet. The drift window (last-sent -
last-seen) SHOULD inform encoding heuristics and latency compensation.
Nonce is 8 octets of cryptographically secure pseudorandom numbers.
Channel-id is set to the active channel that the command applies to, which will
be zero unless additional channels have been negotiated. The command value and
command specific fields will be covered in the remainder of this section.
If the channel referenced by ID is invalid and refers to a previously closed
channel, the command should be discarded and processing continue as normal.
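The control packet layout described above can be sketched with the `struct` module. This assumes the 128 octets cover the control block starting at last-seen (the outer MAC, sequence number and type are not included), that the fields are packed without padding in the listed order, and that zero padding is used. `build_control` and `parse_control` are hypothetical names.

```python
import os
import struct

CONTROL_SIZE = 128
# last-seen (u64), nonce (u8[8]), ch-id (u8), command (u8), network byte order
HEADER = struct.Struct("!Q8sBB")


def build_control(last_seen: int, channel: int, command: int,
                  payload: bytes = b"") -> bytes:
    """Pack the header, append command-specific data, zero pad to 128 octets."""
    if len(payload) > CONTROL_SIZE - HEADER.size:
        raise ValueError("command-specific data does not fit")
    head = HEADER.pack(last_seen, os.urandom(8), channel, command)
    return (head + payload).ljust(CONTROL_SIZE, b"\0")


def parse_control(packet: bytes):
    """Split a control block into its header fields and the remaining octets."""
    if len(packet) != CONTROL_SIZE:
        raise ValueError("control packets are exactly 128 octets")
    last_seen, nonce, channel, command = HEADER.unpack_from(packet)
    return last_seen, nonce, channel, command, packet[HEADER.size:]
```

For example, a PING (command 7) referencing stream 3 could be built with `build_control(41, 0, 7, struct.pack("!I", 3))`.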
## Command 0: HELLO
This command was covered in [Section 3](#authentication-and-cryptography).
## Command 1: SHUTDOWN
last-words : u8[32]
This terminates a channel with an optional short message describing the reason
for termination, if any.
## Command 2 : DEFINE-CHANNEL
id : u8
type : u8
direction : u8
This creates another communication channel. Every channel can be a recipient of
commands, and may contain between zero and three ongoing data streams: one for
video, one for audio and one for binary. Audio and Video are unidirectional and
direction established on channel allocation.
Channel allocation SHOULD be paired with, and triggered by, secondary events
from user interaction and, while possible, is not expected to be called
arbitrarily.
Because of these secondary events (see [Section 6](#event-model)) there is no
provision for collision avoidance in channel allocation, should both sides
decide to define the same channel identifier within a collision window.
It is RECOMMENDED to split the ID namespace such that the source uses
odd-numbered identifiers and the sink uses even-numbered ones, but this is
merely a precaution.
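The recommended split of the identifier namespace could look like the sketch below. It assumes that channel 0 is the primary channel and therefore never allocated, and that identifiers are simply handed out in ascending order; `channel_ids` is a hypothetical helper.

```python
def channel_ids(is_source: bool):
    """Yield u8 channel identifiers: odd for a source, even for a sink."""
    start = 1 if is_source else 2
    return iter(range(start, 256, 2))
```

Calling `next()` on the returned iterator gives the identifier for the next DEFINE-CHANNEL command until the u8 range is exhausted.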
The type value serves as a hint about the intended use in the local windowing
system. It is covered in the 'REGISTER' event part of the [Target
Commands](#target) section of the event model.
## Command 3 : STREAM-CANCEL
id : u32
reason : u8
type : u8
This cancels an ongoing stream on the channel. Id carries the identifier
provided in the corresponding DEFINE-AUDIO-STREAM, DEFINE-BINARY-STREAM or
DEFINE-VIDEO-STREAM command. Reason can be:
0 - Undesired
The sink is no longer interested in the contents of the stream and the source
MUST stop sending over the channel as soon as this command is received.
1 - Unhandled Format
The sink is not capable of decoding the stream contents due to an
incompatibility with the encoding scheme present. This can happen at any point
during decoding. The source SHOULD attempt to re-open the stream with a more
compatible codec, even if this means raw pixel stream deltas compressed with
the REQUIRED Zstd compression option.
2 - Already Known
The source already has the contents of the stream available locally. This is a
possible outcome for certain binary transfers of assets that can persist across
connections, such as files used for text typeface.
## Command 4 : DEFINE-VIDEO-STREAM
This command is described in [Section 5.2, Video](#video).
## Command 5 : DEFINE-AUDIO-STREAM
This command is described in [Section 5.3, Audio](#audio).
## Command 6 : DEFINE-BINARY-STREAM
This command is described in [Section 5.4, Binary](#binary).
## Command 7 : PING
stream-id : u32
This command can be used by either source or sink for a channel and it is
RECOMMENDED that it is sent periodically both as connection keep-alive and to
assist each side with congestion window size tracking. The stream ID field
references the last known completed stream, if any.
## Command 8 : REKEY
This command is described in [Section 3, Authentication](#authentication-and-encryption).
## Command 9 .. 14 : DIRECTORY EXTENSION
These command numbers are reserved for the directory extension. Their values
and use are described in [Section 8](#directory-extension).
<a name="streaming-transfers">
# Streaming Transfers
</a>
As covered in COMMAND 2, DEFINE-CHANNEL, each channel is a container for one
unidirectional audio stream, one unidirectional video stream and one
bidirectional binary stream. To initiate a stream on a channel, the appropriate
end issues a corresponding DEFINE-VIDEO-STREAM, DEFINE-AUDIO-STREAM or
DEFINE-BINARY-STREAM command, followed by interleaved data packets of the
same type.
The data packets (3 VSTREAM-DATA, 4 ASTREAM-DATA, 5 BSTREAM-DATA) all have
the same header fields:
channel-id : u8
stream-id : u32
length : u16
followed by 'length' number of contiguous bytes to expect. It is the
full header+variable data block that is used to calculate and verify the
message authentication code as per [3. Authentication and
Encryption](#authentication-and-encryption).
The implementation may implement a number of strategies for chunking and
interleaving a stream, informed by current congestion window size, abstract
window type and event flow.
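One simple chunking strategy for the data packet header above is sketched here. It assumes the three header fields are packed without padding in network byte order, and it ignores the outer frame (MAC, sequence number, type) and encryption. `chunk_stream` and `split_packet` are hypothetical names.

```python
import struct

# channel-id (u8), stream-id (u32), length (u16), network byte order
DATA_HEADER = struct.Struct("!BIH")
MAX_CHUNK = 0xFFFF  # largest value the u16 length field can carry


def chunk_stream(channel: int, stream_id: int, payload: bytes,
                 chunk_size: int = MAX_CHUNK):
    """Yield header + data blocks of at most chunk_size payload bytes each."""
    if not 0 < chunk_size <= MAX_CHUNK:
        raise ValueError("chunk_size must fit the u16 length field")
    for offset in range(0, len(payload), chunk_size):
        chunk = payload[offset:offset + chunk_size]
        yield DATA_HEADER.pack(channel, stream_id, len(chunk)) + chunk


def split_packet(packet: bytes):
    """Return (channel, stream_id, data) for one header + data block."""
    channel, stream_id, length = DATA_HEADER.unpack_from(packet)
    data = packet[DATA_HEADER.size:DATA_HEADER.size + length]
    if len(data) != length:
        raise ValueError("truncated data packet")
    return channel, stream_id, data
```

A smaller `chunk_size` makes it easier to interleave packets from other streams and commands between chunks, at the cost of more header and MAC overhead.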
<a name="video">
## Video
</a>
A video frame transfer is initiated with a DEFINE-VIDEO-STREAM command (4),
followed by a number of vstream data packets (packet type 3). It is recommended
that those data packets are interleaved with other ongoing stream and command
transfers, with priority given to the channel with most recent user interaction
and activity focus through the VIEWPORT event (see [Section 6](#event-model)).
The fields of the 'DEFINE-VIDEO-STREAM' command are as follows:
id : u32
format : u8
surface width : u16
surface height : u16
x : u16
y : u16
frame width : u16
frame height : u16
flags : u8
compressed size : u32
uncompressed size : u32
commit : u8
four-cc : u8[4]
ID is a source defined identifier. It is local to the channel the stream is
being defined on and MUST NOT collide with other streams defined on the same
channel. It is RECOMMENDED that this is tracked locally per channel and
incremented each time a stream is defined. An implementation MUST NOT permit
multiple streams of the same type in flight without being explicitly cancelled.
A single stream can be used to convey a number of image frames, and only needs
to be redefined if the dimensions of the backing store change.
Format defines the encoding method for the data being sent:
0 : 32-bit, R8G8B8A8 with linear alpha.
1 : 24-bit, linear full-opaque R8G8B8
2 : 16-bit, linear R5G6B5
5 : H264 stream
7 : ZSTD compressed TPACK block
8 : ZSTD compressed full frame
9 : ZSTD compressed delta frame
10 : Passthrough stream
The 3,4,6 format values are deprecated but kept allocated to retain
compatibility with dated implementations still using them. It is RECOMMENDED
that any encountered unhandled format value triggers a STREAM-CANCEL command
with unhandled format (1) as reason for cancellation.
The 'TPACK' format is described in [Section 5.5, Text](#text).
If format is set to passthrough (10) the four-cc field SHOULD contain the
fourCC encoded identifier of the encoder type, if known. This is used to permit
an opaque bitstream link with hardware encoders where the protocol
implementation might lack access to specifics due to hardware, security and
architectural segmentation.
All region and surface dimensions are in upper-left origo buffer order.
These are further modified by the 'flags' bitmap of possible processing hints:
1 : origo-lower-left
This bit is set if the decompressed buffer has an inverted row order and should
be flipped later in the processing pipeline.
If the format is one of the known raw (0,1,2) or compressed raw (8,9) types,
the x, y, frame width and frame height fields specify the affected region of
the defined surface. Multiple updates can be sent in sequence and changes
accumulate at the receiving sink end. Updates MUST NOT be passed on locally
until a stream with the 'commit' field set to a non-zero value has been
received.
An implementation MUST calculate the uncompressed size based on the format and
surface dimensions and compare it against the received uncompressed size value
before allocating any decompression buffer space. If the calculated value does
not match the received one, it MUST reject the stream by issuing a
STREAM-CANCEL command.
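The size check can be illustrated for the raw formats, where the bytes per pixel follow directly from the format list (32, 24 and 16 bits). Which pixel layout applies to the decompressed output of the other formats is not stated here, so this sketch refuses them instead of guessing. The function names are hypothetical.

```python
# Bytes per pixel for the raw formats 0 (R8G8B8A8), 1 (R8G8B8) and 2 (R5G6B5).
BYTES_PER_PIXEL = {0: 4, 1: 3, 2: 2}


def expected_uncompressed_size(fmt: int, width: int, height: int) -> int:
    """Compute the buffer size implied by the format and dimensions."""
    if fmt not in BYTES_PER_PIXEL:
        raise ValueError("size rule for this format is not covered by the sketch")
    return width * height * BYTES_PER_PIXEL[fmt]


def size_matches(fmt: int, width: int, height: int, claimed: int) -> bool:
    """True if the received 'uncompressed size' agrees with the calculation.

    A False result should lead to STREAM-CANCEL before any buffer is allocated.
    """
    return expected_uncompressed_size(fmt, width, height) == claimed
```

Doing the comparison before allocation means a peer cannot force a large allocation simply by sending an inflated size field.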
<a name="audio">
## Audio
</a>
An audio frame transfer is initiated with a DEFINE-AUDIO-STREAM command (5),
followed by a number of astream data packets (packet type 4).
The fields of the 'DEFINE-AUDIO-STREAM' command are as follows:
id : u32
channels : u8
encoding : u8
nsamples : u16
rate : u32
The following encodings are supported:
signed 16-bit (0)
<a name="binary">
## Binary
</a>
A binary 'blob' transfer is initiated with a DEFINE-BINARY-STREAM command (6),
followed by a number of bstream data packets (packet type 5).
The fields of the 'DEFINE-BINARY-STREAM' command are as follows:
stream-id : u32
size : u64
type : u8
token-id : u32
checksum : u8[16]
compressed : u8
Stream identifier shares namespace with audio and video streams. It MUST be
unique. It is SUGGESTED that they are allocated through a shared incremental
counter.
The size field specifies how many bytes should be transferred in total, or 0 if
the stream is continuous. In that case, completion and progress notification is
conveyed over the STREAMSTATUS event.
The type MUST be one of the following:
state (0) event trigger: STATE-IN, STATE-OUT
bchunk (1) event trigger: BCHUNKSTATE, BCHUNK-IN, BCHUNK-OUT
font (2) event trigger: FONTHINT
font-secondary (3) event trigger: FONTHINT
debug (4),
appl (5), appl-controller (6) (See DIRECTORY extension)
metadata (7) event trigger: BCHUNKSTATE
The token ID is a custom identifier used to pair the ongoing stream with a
queued event in the outer desktop.
The checksum, if known, should use BLAKE3 in unkeyed hash mode. Its purpose is
for the other end to check for a locally cached version, and issue a
STREAM-CANCEL command (reason 2, Already Known) if a matching one exists.
The metadata type can only be attached to something with a pre-existing bchunk
or state store. It is used to provide additional indexing annotation,
authentication signature, precomputed hash and transfer hints (e.g.
compression).
It uses the same encoding scheme as [Section 8.3, FAP Format](#fap-format) with
the additional restriction that if the metadata is signed, the first entries
MUST be:
ksig=base64(public-key-for-signature)
sign=base64(signature of kpub over hashed metadata)
The metadata MUST also contain a hash=base64(BLAKE3 hash of the associated
file). The following keys are reserved:
compression = precompressed | compress | uncompressed
extension = 3-4 letter file type identifier
mime = RFC6838 conforming type identifier
visibility = private | public
Any other key/value pairs are implementation defined. If a directory server
permits updating metadata which already carries a signature it MUST verify that
the new metadata is signed with the same kpub as the previous signature, or
that the sender has authenticated the old kpub through the REKEY command.
The latter case can happen when the signer wants to rotate signing keys.
<a name="text">
## Text
</a>
A channel can be used to provide formatted text as a specially encoded 'TPACK'
video stream. These are always compressed with ZSTD and values are encoded in
little-endian.
Each frame starts with a 16 octet frame header:
data-size : u32
line-count : u16
cell-count : u16
scroll-direction : u8
frame-flags : u16
background-colour: u8[4]
cursor-state : u8
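The 16 octet frame header above can be unpacked as in the following sketch. It assumes the buffer has already been ZSTD decompressed, and that the fields are packed in the listed order without padding, which adds up to exactly 16 octets. `TpackFrameHeader` and `parse_frame_header` are hypothetical names.

```python
import struct
from typing import NamedTuple

# u32, u16, u16, u8, u16, u8[4], u8 in little-endian: 16 octets in total
FRAME_HEADER = struct.Struct("<IHHBH4sB")


class TpackFrameHeader(NamedTuple):
    data_size: int
    line_count: int
    cell_count: int
    scroll_direction: int
    frame_flags: int
    background_colour: bytes
    cursor_state: int


def parse_frame_header(buf: bytes) -> TpackFrameHeader:
    """Decode the header at the start of a decompressed TPACK frame."""
    if len(buf) < FRAME_HEADER.size:
        raise ValueError("buffer shorter than the 16 octet frame header")
    return TpackFrameHeader(*FRAME_HEADER.unpack_from(buf))
```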
Each line contains:
start-line : u16
cell-count : u16
cell-offset : u16
content-dir : u8 ?!
scroll-dir : u8 ?!
line-state : u8
Followed by cell-count of cells:
* 3 bytes front_color
* 3 bytes back_color
* 2 byte attribute bitmap
* 4 bytes glyph-index or ucs4 code
* attribute bits (byte 0)
* bit 0: bold
* bit 1: underline
* bit 2: underwave
* bit 3: italic
* bit 4: strikethrough
* bit 5: cursor
* bit 6: shape break (re-align to grid)
* bit 7: skip-bit (double-width)
* (byte 1)
* bit 0: glyph-index
* bit 1: glyph-index-alt-font
* bit 2: border-right
* bit 3: border-down
* bit 4: border-left
* bit 5: border-top
* bit 6: treat color as palette reference (first byte of front_color)
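A single cell can be decoded along these lines. The sketch assumes the four cell fields are packed in the listed order without padding (12 octets), that byte 0 of the attribute bitmap comes first, and that the 4 byte glyph-index or UCS4 code is little-endian like the rest of TPACK. The attribute names and `decode_cell` are hypothetical labels for the bits listed above.

```python
import struct

# front_color (3), back_color (3), attribute byte 0, attribute byte 1, code (u32)
CELL = struct.Struct("<3s3sBBI")

ATTR_BYTE0 = ("bold", "underline", "underwave", "italic",
              "strikethrough", "cursor", "shape_break", "skip")
ATTR_BYTE1 = ("glyph_index", "glyph_index_alt_font", "border_right",
              "border_down", "border_left", "border_top", "palette_color")


def decode_cell(buf: bytes, offset: int = 0) -> dict:
    """Decode one 12 octet cell starting at offset."""
    front, back, attr0, attr1, code = CELL.unpack_from(buf, offset)
    attrs = {name for bit, name in enumerate(ATTR_BYTE0) if attr0 & (1 << bit)}
    attrs |= {name for bit, name in enumerate(ATTR_BYTE1) if attr1 & (1 << bit)}
    return {"front": front, "back": back, "attrs": attrs, "code": code}
```

When `glyph_index` is absent from `attrs`, the `code` value would be read as a UCS4 code point; when `palette_color` is present, the first byte of `front` is a palette reference.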
<a name="event-model">
# Event Model
</a>
Event type packets have a fixed 128 byte size. The categories and types are a
filtered subset of those present in [SHMIF]. Naming and numbering conventions
are kept to match existing consumers of SHMIF. This is intended to provide an
easier path for integrating with local applications and windowing systems.
Gaps in the command numbering mark commands that have a reasonable local use
but conflict with the networked case.
Each packet has a 1 byte category selector:
category : u8
The PERMITTED category values are:
input-device : 2
target-command : 16
external-hint : 64
An implementation MUST block/warn, discard/warn or terminate if a value from a
non-permitted category is found as this suggests a routing or filtering issue
with other users of SHMIF.
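The category filter can be sketched as below, assuming the category selector is the first byte of the 128 byte event block. The sketch raises on anything outside the permitted set and leaves the choice between blocking, discarding and terminating to the caller. `event_category` is a hypothetical name.

```python
EVENT_SIZE = 128
PERMITTED_CATEGORIES = {
    2: "input-device",
    16: "target-command",
    64: "external-hint",
}


def event_category(packet: bytes) -> str:
    """Return the category name, or raise if the event must not be forwarded."""
    if len(packet) != EVENT_SIZE:
        raise ValueError("event packets are exactly 128 bytes")
    category = packet[0]
    if category not in PERMITTED_CATEGORIES:
        raise ValueError(f"non-permitted event category {category}")
    return PERMITTED_CATEGORIES[category]
```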
<a name="input">
## Input
</a>
Event category 2 is used for input events. This is most commonly provided when
a user is interacting with a window that has been provided over a channel.
These have a frame format of:
input-kind : u32
device-kind : u32
datatype : u32
label : u8[16]
flags : u8
device-id : u16
device-subid : u16
segment-token : u32
sample-ts : u64
Input-kind and device-kind are hints as to device and sampling origin, with
datatype specifying the layout of the remaining bytes in the event packet.
Input kind MUST be one of the following values:
button : 0
axis-motion : 1
touch : 2
status : 3
eyes : 4
Device kind MUST be one of the following values:
keyboard : 1
mouse : 2
game-controller : 4
touch-display : 8
led-controller : 16
eyetracker : 32
status : 64
These are laid out as a bitmask both for internal routing uses, and the
INPUTMASK events that can be used to disable forwarding of several device
categories.
The label is a custom, short, ASCII encoded tag. This is used to pair with
LABELHINT events sent by the source in order to convey suggested binding and to
allow outer windowing system to reliably rebind or reroute.
Flags is a bitmap used to indicate whether the event sample is associated with
a gesture (& 1), or with input access or routing entering (& 2) or leaving
(& 4) a surface active state.
Device ID is a source-local non-unique identifier to distinguish between one
device or another, and subid for devices with multiple associated input
sources.
The segment token is normally set to zero, but can be used to reference a
segment bound on some channel when manually rerouting, forwarding or
synthesizing input events.
The sample-ts timestamp is a monotonic clock in microseconds updated when the
sample was generated, for comparison against previous samples from the same
channel.
If the datatype is specified as ANALOG (=1):
relative : u8
count : u8
samples : d16[4]
Relative defines if the values provided in samples are relative to their
previously defined value (starting at 0). Count defines how many samples are
provided and MUST be larger than zero and less than or equal to 4.
If the datatype is specified as DIGITAL (=2):
active : u8
An active value of 1 means that the button is being held, and 0 that it has
been released.
If the datatype is specified as TRANSLATED (=4):
codepoint : u8[5]
active : u8
scancode : u8
symbol : u32
modifiers : u16
Codepoint refers to a single, 0 terminated UTF-8 encoded unicode codepoint, or
zero if there is no available translation for the event.
If active is set to 1, the translated input has been activated (rising),
otherwise it has been released (falling).
The scancode is a device-local reference for the button input which triggered
the event and SHOULD be considered a last resort for case by case
compatibility.
The symbol is a segment type relative lookup table index. It is RECOMMENDED
that the default table used is that of [SDL2] due to the range of platforms it
has been verified against. It is SUGGESTED that for Segment types such as
WAYLAND and X11, the `<X11/keysymdef.h>` table is used as per [XLIBREF].
Modifiers is a bitmask, with the following bit allocation:
LEFT-SHIFT : 1
RIGHT-SHIFT : 2
LEFT-CONTROL : 3
RIGHT-CONTROL : 4
LEFT-ALT : 5
RIGHT-ALT : 6
LEFT-META : 7
RIGHT-META : 8
NUMLOCK : 9
CAPSLOCK : 10
MODE : 11
REPEAT : 12
The REPEAT modifier indicates that the event is an oscillating input and the
timestamp/congestion state SHOULD be considered before forwarding in order to
avoid accidental oscillations due to network conditions.
If the datatype is specified as TOUCH (=8):
active : u8
x, y : d16
pressure, size : f32
tilt-x, tilt-y : d16
tool : u8
If the datatype is specified as EYES (=16):
head position : f32[3]
head angle : f32[3]
gaze-region : f32[4]
user-present : u8
<a name="target">
## Target
</a>
Target command events are authoritative instructions flowing from sink to source.
Their numbering and allocations have evolved organically, with gaps in event
value caused by deprecation or being masked due to poor translation from a
local to network processing model. Values not present in this set MUST
transition the connection to a terminal state.
Most of these require little intervention on the protocol level, but are
expected to have a meaningful translation to the local windowing system.
EXIT (1)
The exit event means that the channel will be severed. No further event
processing will be considered in either direction. This SHOULD result in a
COMMAND-CLOSE on the channel.
FRAMESKIP (2)
framecount : s32
The frameskip event means that only every 'framecount'-th frame should be sent.
This is useful for fast-forward stepping through contents. This can be
implemented either at the protocol layer or in the local windowing system, IF
it supports such a feature.
RESET (9)
level : s32
The RESET event means that the internal state of the source should change
due to a request from a user or an error in the local windowing system. The
level MUST be one of:
0 : Soft
1 : Hard
2 : Recovery
3 : Reconnect
Soft means that content and application state should return to as close to the
initial state as possible. Hard extends Soft to also include renegotiation of
additional resources such as fonts. Recovery extends Hard with the annotation
that any and all previously accumulated state has been lost.
Reconnect extends Hard, but content preferences may also be different as the
backing connection may have migrated to another windowing environment.
PAUSE (10)
The PAUSE event means that all events other than RESET, UNPAUSE or EXIT MUST be
ignored or discarded.
UNPAUSE (11)
The UNPAUSE event cancels out the restrictions from a previous PAUSE event.
SEEKTIME (12)
mode : s32
timestamp : f32
The SEEKTIME event indicates that if the data source has a seekable notion of
temporally dependent content, it SHOULD seek to as close to the desired time as
possible.
The mode MUST be one of:
0 : Relative
1 : Absolute
A relative timestamp value is relative to the current content position and the
value is in discrete monotonic ticks on some local reference clock.
An absolute position is a floating point percentage in the 0..1 range, with 0
meaning the start of the stream and 1 the end of the stream.
SEEKCONTENT (13)
mode : s32
mode = 0 (relative)
dx : s32
dy : s32
dz : s32
mode = 1 (absolute)
x : f32
y : f32
z : f32
The SEEKCONTENT event MAY be used if the source has previously issued a
CONTENTHINT event indicating that there is spatial content which does not fit
the current window dimensions.
The absolute coordinate defines the upper left corner as a 0..1 encoded
percentage of the current window dimensions.
DISPLAYHINT (14)
width : s32
height : s32
hint : s32
layout : s32
density : f32
cell-width : d32
cell-height : d32
token : d64
The DISPLAYHINT event indicates to the source which dimensions it will be
presented at. If these differ from the ones that the source has defined in its
video stream, this means that the contents MAY be scaled to fit.
The tuple [cell-width, cell-height] is feedback to TPACK encoded channels
about the nominal cell dimensions based on the currently active font and
desired text size.
STREAMSET (16)
identifier : d32
The STREAMSET event MAY be sent to a source that has previously notified that
there are alternate data streams for viewing the content through a STREAMINFO
event. The provided identifier SHOULD be a value in the 0..n range provided in
that event.
ATTENUATE (17)
gain : f32
The ATTENUATE event MAY be sent to a source to request that the input gain on
any audio presented on the channel SHOULD be lowered to the gain value (within
0..1 range) before being passed as ASTREAM packets.
REQFAIL (20)
cookie : u32
The REQFAIL event MUST be passed in response to a BCHUNKSTATE or SEGREQ command
that could not be fulfilled due to constraints in the local windowing system.
GRAPHMODE (23)
group: u32
color: u8[3]
The GRAPHMODE event is used to communicate preferred colors used to prepare
VSTREAM transfers, depending on constraints passed in the local windowing system.
If the 8th bit of group is not set, the event refers to the FOREGROUND colour.
If the 8th bit is set, and the group value permits separate
BACKGROUND/FOREGROUND colours, the event refers to the BACKGROUND colour of the
group.
The permitted group values are:
PRIMARY(2) : base colour (FOREGROUND, REFERENCE)
SECONDARY(3) : alternate colour, contrast to PRIMARY (FOREGROUND, REFERENCE)
BACKGROUND(4) : background colour, (BACKGROUND, REFERENCE)
TEXT(5) : default content text (FOREGROUND, BACKGROUND)
CURSOR(6) : input caret colour (FOREGROUND)
ALTCURSOR(7) : input caret colour in locked/modal state (FOREGROUND)
HIGHLIGHT(8) : text marked for user attention (FOREGROUND, BACKGROUND)
LABEL(9) : text used for UI elements (FOREGROUND, BACKGROUND)
WARNING(10) : text used to alert the user to a moderately severe problem
(BACKGROUND, FOREGROUND)
ERROR(11) : text used to alert the user to a severe problem
(BACKGROUND, FOREGROUND)
ALERT(12) : text used to alert the user towards immediate attention
(BACKGROUND, FOREGROUND)
REFERENCE(13) : links to files or Internet URLs
(BACKGROUND, FOREGROUND)
INACTIVE(14) : text used for UI elements that cannot be accessed
(BACKGROUND, FOREGROUND)
UI(15) : text used for generic UI elements
The values from 16 to 31 are used for a reference palette matching the display
attributes of terminals descended from the VT100, in ascending order:
BLACK, RED, GREEN, YELLOW, BLUE, MAGENTA, CYAN, LIGHT GREY, DARK GREY, LIGHT
RED, LIGHT GREEN, LIGHT YELLOW, LIGHT BLUE, LIGHT MAGENTA, LIGHT CYAN.
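As a rough illustration of the group encoding described above, the following Python sketch splits a received group value into a label and a background flag. It assumes that "the 8th bit" means the mask 0x80, which this document does not state explicitly. The names `GROUPS`, `BACKGROUND_BIT` and `decode_group` are hypothetical and not part of any reference implementation.

```python
# Group names copied from the list above; 16..31 index the reference palette.
GROUPS = {
    2: "PRIMARY", 3: "SECONDARY", 4: "BACKGROUND", 5: "TEXT", 6: "CURSOR",
    7: "ALTCURSOR", 8: "HIGHLIGHT", 9: "LABEL", 10: "WARNING", 11: "ERROR",
    12: "ALERT", 13: "REFERENCE", 14: "INACTIVE", 15: "UI",
}

# Assumption: "8th bit" is counted from 1, i.e. the mask 0x80.
BACKGROUND_BIT = 0x80


def decode_group(group):
    """Return (label, is_background) for a GRAPHMODE group value."""
    is_background = bool(group & BACKGROUND_BIT)
    base = group & ~BACKGROUND_BIT
    if 16 <= base <= 31:
        # Palette slots are reported by index rather than by colour name.
        return ("PALETTE[%d]" % (base - 16), is_background)
    if base in GROUPS:
        return (GROUPS[base], is_background)
    raise ValueError("unknown GRAPHMODE group: %d" % group)
```

A receiver would still need to check whether the decoded group permits a separate background colour before applying it.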
MESSAGE (24)
message : u8[78]
The message event SHOULD be used sparingly for domain specific workarounds,
as well as short-form content on CLIPBOARD channel types.
The message field MAY be padded with NUL bytes but MUST NOT exceed the fixed
length. For longer binary transfers, the BINARY-STREAM command and BINARY
packets SHOULD be used in response to BCHUNKSTATE commands.
FONTHINT (25)
size-mm : f32
hint : u32
continuation : u32
FONTHINT is used to suggest desired properties of source text rasterization.
It is combined with DISPLAYHINT in order to resolve to font-local formats.
If size is set to 0, the size is unchanged from previous values or some
implementation defined default.
If there is an accompanying Font file in a format supported by [RFC8081] it
should be transmitted through a BINARY-STREAM command and BINARY packets
directly following the event.
The continuation field is set to (1) if the font transfer should append as
fallback to glyphs not present in previous transfers.
GEOHINT (26)
latitude : f32
longitude : f32
elevation : f32
country : u8[4]
spoken-lang : u8[4]
written-lang : u8[4]
The GEOHINT event is used to suggest parameters for supporting localisation and
positioning, for sources which can adapt to such features.
The values for the country field follow ISO-3166-1 alpha-3 with NUL byte
termination. The values for the spoken-lang and written-lang follow ISO-639-2
alpha-3 with NUL byte termination.
OUTPUTHINT (27)
max-width : u32
max-height : u32
vertical-refresh : u32
min-width : u32
identifier : u32
variable-min : f32
variable-step : f32
The OUTPUTHINT event is used to provide details about the physical displays
that the segment source is mapped to and SHOULD be provided prior to the
DISPLAYHINT which covers how the segment is presented.
ACTIVATE (28)
The ACTIVATE event is provided to terminate the set of events provided in the initial
burst when a new channel has been mapped.
ANCHORHINT (30)
relative-x : s32
relative-y : s32
relative-z : s32
source : s64
parent : s64
namespace : u32
The ANCHORHINT event is used to relay information about positioning in the
local sink windowing system. The relative position values are relative to some
global anchor if parent is not referenced.
The source field is set to a segment token if the event relays information
about other windows that the target channel has a pre-established relationship
to.
If 'namespace' is set to 1, the source and parent fields reference
source-provided identifiers instead of sink-provided segment identifiers.
<a name="external">
## External
</a>
External events are descriptive events from source to sink. They MAY affect
behaviour on sink processing, but any actions are implementation-defined by
the local windowing system.
Values not present in this set MUST transition the connection to a terminal
state.
MESSAGE (0)
data : u8[78]
multipart : u8
This corresponds to the MESSAGE command event, with the notable change that if
multipart is set to 1 the MESSAGE is a continuation of the previous one and
should be merged together at a discrete UTF-8 boundary.
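To illustrate what a "discrete UTF-8 boundary" implies for a sender, here is a minimal Python sketch that splits a long string into chunks that each fit the data field without cutting a multi-byte sequence in half. The function name `split_utf8` is hypothetical, and the sketch assumes that all 78 bytes may carry payload. How the multipart flag is assigned to each chunk is left to the caller.

```python
def split_utf8(text, limit=78):
    """Split text into UTF-8 byte chunks of at most `limit` bytes each,
    never cutting inside a multi-byte sequence."""
    if limit < 4:
        raise ValueError("limit must fit the longest UTF-8 sequence")
    data = text.encode("utf-8")
    chunks = []
    pos = 0
    while pos < len(data):
        end = min(pos + limit, len(data))
        # Continuation bytes look like 10xxxxxx; back up until the cut
        # lands on the first byte of a sequence (or on the end of data).
        while end < len(data) and (data[end] & 0xC0) == 0x80:
            end -= 1
        chunks.append(data[pos:end])
        pos = end
    return chunks
```

Each returned chunk decodes as valid UTF-8 on its own, so the receiving side can merge chunks by plain concatenation.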
IDENT (2)
data : u8[78]
multipart : u8
IDENT changes the dynamic identity of the segment bound window on the channel.
This is used for ancillary tags to the immutable name provided in REGISTER,
such as the name of a currently open document.
STREAMINFO (6)
identifier : u8
kind : u8
language : u8[4]
This indicates that there are alternate media streams that can be switched
to without creating a new segment or channel. The STREAMSET command is used
to activate the stream specified by identifier.
The kind field can be one of:
0 - Audio
1 - Video
2 - Text
3 - Overlay
The values for the language field follow ISO-639-2 alpha-3 with NUL byte
termination.
STREAMSTATUS (7)
time-string : u8[9]
time-limit : u8[9]
completion : f32
streaming : u8
frame-number : u32
The STREAMSTATUS event MAY be used to convey metadata about the ongoing VSTREAM
transfer on the channel.
The time-string and time-limit fields are NUL terminated 7-bit ASCII in the
HH:MM:SS format showing the current time and total runtime length if
streaming is set to 0 and the time is known.
If the time is unknown but the frame count is known, the completion field
is set to a value in the 0..1 range estimating the percentage
(frame-number / total-frames).
If there is no time information, the streaming field MUST be set to 1.
The frame-number is the sequential monotonic counter of the frame position in
the stream.
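The field rules above can be sketched in Python as follows. This is only an illustration of the HH:MM:SS encoding and of the completion ratio; the helper names `format_hms` and `completion` are hypothetical, and the total frame count is assumed to be known from elsewhere since it is not carried in the event.

```python
def format_hms(seconds):
    """Encode a duration as NUL terminated 7-bit ASCII HH:MM:SS (9 bytes)."""
    seconds = int(seconds)
    if seconds < 0 or seconds >= 100 * 3600:
        raise ValueError("duration does not fit the HH:MM:SS format")
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return ("%02d:%02d:%02d" % (hours, minutes, secs)).encode("ascii") + b"\x00"


def completion(frame_number, total_frames):
    """Estimate completion as frame-number / total-frames, clamped to 0..1."""
    if total_frames <= 0:
        raise ValueError("total frame count must be positive")
    return min(1.0, max(0.0, frame_number / total_frames))
```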
STATESIZE (8)
size : u32
type : u32
The STATESIZE event is provided to inform the Sink end that the segment
supports saving and restoring state through an accompanying BSTREAM and
STATE-IN or STATE-OUT target event.
The type field is a custom weak identifier used when there are multiple
segments that support state management independently.
SEGMENT-REQUEST (10)
identifier : u32
width : u16
height : u16
x-offset : s16
y-offset : s16
direction : u8
hints : u8
kind : u8
The SEGMENT-REQUEST event is used to request the local windowing system to
create a new window that the A12 transport can map to a new channel.
The identifier is a caller chosen value to be paired with a REQFAIL if the
request cannot be completed.
The width and height fields specify the preferred initial dimensions, with
x-offset and y-offset the relative position to the parent the request event is
sent through.
The kind field is a type hint about the purpose of the new window. The set
of permitted values are:
Arcan : 1
Media : 2
Terminal : 3
Sensor : 4
Game : 5
Application : 6
Browser : 7
Virtual Machine : 8
Stereoscopic : 9
Popup : 10
Icon : 11
Titlebar : 12
Cursor : 13
Accessibility : 14
Clipboard : 15
Widget : 16
Text-UI : 17
Service : 18
X11 : 19
Wayland : 20
Handover : 21
Audio : 22
Debug : 255
The generic ones with fitting translations in most windowing systems would be
Media (low interactivity, high asymmetric throughput), Application (generic
option when nothing else matches), Game (latency over fidelity, timing sensitive
and highly interactive), Browser (complex security model), Virtual Machine
(resizes act as expensive display events, input is device native), Popup
(short-lived, recurring contents with grab and focus semantics) and Terminal
(TPACK format buffers, cell oriented layout, sizing and input binning).
Audio is for providing multiple positioned channels to mix with those of the
parent, and VIEWPORT events become a way of spatially positioning the audio.
Icon, Titlebar and Cursor are used to subdivide the parent for custom
decoration and alternate visual identity purposes.
Accessibility and Debug are used to annotate contents of the parent.
The direction field is a window management hint to suggest how the window
should align to the parent, possibly affecting its size.
The permitted values are:
0 : don't care
1 : split and position to the left
2 : split and position to the right
3 : split and position above
4 : split and position below
5 : attach to the left
6 : attach to the right
7 : attach above
8 : attach below
9 : set as embedded tab
10 : set as embedded inside window canvas
11 : replace parent until closed
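For implementers, the two value tables of SEGMENT-REQUEST map naturally onto enumerations. The Python sketch below copies the numeric values from the lists above; the class, member and function names are hypothetical spellings chosen for this example only.

```python
from enum import IntEnum


class SegmentKind(IntEnum):
    ARCAN = 1
    MEDIA = 2
    TERMINAL = 3
    SENSOR = 4
    GAME = 5
    APPLICATION = 6
    BROWSER = 7
    VIRTUAL_MACHINE = 8
    STEREOSCOPIC = 9
    POPUP = 10
    ICON = 11
    TITLEBAR = 12
    CURSOR = 13
    ACCESSIBILITY = 14
    CLIPBOARD = 15
    WIDGET = 16
    TEXT_UI = 17
    SERVICE = 18
    X11 = 19
    WAYLAND = 20
    HANDOVER = 21
    AUDIO = 22
    DEBUG = 255


class SegmentDirection(IntEnum):
    DONT_CARE = 0
    SPLIT_LEFT = 1
    SPLIT_RIGHT = 2
    SPLIT_ABOVE = 3
    SPLIT_BELOW = 4
    ATTACH_LEFT = 5
    ATTACH_RIGHT = 6
    ATTACH_ABOVE = 7
    ATTACH_BELOW = 8
    EMBEDDED_TAB = 9
    EMBEDDED_CANVAS = 10
    REPLACE_PARENT = 11


def check_request(kind, direction):
    """Map raw field values to enum members; ValueError if not permitted."""
    return SegmentKind(kind), SegmentDirection(direction)
```

A sink could call `check_request` on the raw field values and answer with a REQFAIL when it raises.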
CURSORHINT (12)
name : u8[78]
Cursorhint changes the mouse cursor that the windowing system should apply when
a mouse cursor is over the surface. The suggested 'name' SHOULD match one of
[W3CCSS]:
wait, forbidden, grabhint, crosshair, hand, zoom-in, zoom-out, help,
context-menu, typefield, datafield, vertical-datafield, cell, alias, drag,
drag-drop, drag-reject, sizeall, west, east, north, south, west-east,
north-south, north-west, south-west, north-east, south-east,
north-west-south-east, south-west-north-east
With the extensions of:
default, hidden, hidden-rel, hidden-abs
These are used for temporarily disabling the cursor and specifying the
preferred sample format (movement delta or window local coordinates).
This mechanism is also used to warp the cursor when VIEWPORT reanchoring a
CURSOR segment is not preferable or possible:
hidden-hot:x,y, input:x,y, warp:x,y
With the values encoded in the name.
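A sink has to separate plain cursor names from the three forms that carry coordinates. The following Python sketch shows one way to do that; it assumes that x and y are written as decimal integers, which the text above does not specify, and `parse_cursorhint` is a hypothetical helper name.

```python
# The three name prefixes that carry an x,y pair, as listed above.
POSITIONAL = ("hidden-hot", "input", "warp")


def parse_cursorhint(name):
    """Return (kind, x, y); x and y are None for plain cursor names."""
    kind, sep, args = name.partition(":")
    if not sep:
        return (kind, None, None)
    if kind not in POSITIONAL:
        raise ValueError("unexpected positional cursor hint: %r" % kind)
    # Assumption: coordinates are decimal integers separated by one comma.
    x_text, y_text = args.split(",")
    return (kind, int(x_text), int(y_text))
```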
VIEWPORT (13)
x : s32
y : s32
width : u32
height : u32
parent-identifier : u32
border : u8[4]
embedded : u8
invisible : u8
focus : u8
anchor-edge : u8
anchor-rectangle : u8
external-identifier : u32
The VIEWPORT event is used to request reanchoring relative to a parent on
behalf of the originating channel bound segment OR another one assuming the
correct token identifier can be provided in the external-identifier field.
The border field annotates the number of pixels assigned to the top, left,
right and bottom edges if the segment has a border that can be cropped away or
used as a drag trigger when there are decorations composited into the
surface.
If anchor edge is set to 1, the x, y are relative to a specific edge of the
parent. The permitted values are:
0 : any
1 : Upper-left
2 : Upper-right
3 : Upper-center
4 : Center-left
5 : Center
6 : Center-right
7 : Lower-left
8 : Lower-center
9 : Lower-right
If anchor-rectangle is set to 1 the width and height fields are applied to
the anchor edge to specify a region of the edge to choose from.
CONTENT-STATE (14)
x-position : f32
y-position : f32
x-size : f32
y-size : f32
cell-width : u8
cell-height : u8
min-width : u8
min-height : u8
max-width : u8
max-height : u8
The CONTENT-STATE event is used to indicate where the current VIEWPORT exists
for a window where only parts of the contents are visible. This allows the
SEEKCONTENT command to suggest that the active region should change, and allows
the sink end to provide UI controls, such as scrollbars, to assist interaction.
The x-position, y-position, x-size and y-size are in the 0..1 floating point
range showing the relative percentage in regards to the full (1) window.
The cell-width and cell-height fields provide resizing alignment hints in
surface pixels for content to be provided without cropping outside tile
boundaries.
The min-width, min-height, max-width, max-height provide resize constraints for
how big or small the local rasterisation can handle without causing visual
artifacts.
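As an example of how a sink might turn the relative fields into a UI control, the Python sketch below computes the pixel geometry of a scrollbar thumb along one axis. It assumes that the position field marks the start of the visible region and the size field its extent, which is one plausible reading of the description above. The function name `scrollbar_thumb` is hypothetical.

```python
def scrollbar_thumb(position, size, track_px):
    """Return (offset_px, length_px) of a scrollbar thumb on a pixel track.

    position and size are the 0..1 values from CONTENT-STATE for one axis.
    """
    # Clamp to the documented 0..1 range before scaling.
    position = min(max(position, 0.0), 1.0)
    size = min(max(size, 0.0), 1.0)
    # Keep the thumb at least one pixel long so it stays visible.
    length = max(1, round(size * track_px))
    # Never let the thumb extend past the end of the track.
    offset = min(round(position * track_px), track_px - length)
    return (offset, length)
```

The same function can be applied to the x and y pairs independently to drive horizontal and vertical scrollbars.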
LABELHINT (15)
label : u8[16]
initial : u16
description : u8[53]
symbol : u8[5]
subid : u16
datatype : u8
modifiers : u16
LABELHINT provides annotation tags for Input events and default keybindings for
the local window system to provide user assistance with avoiding conflicting
keybindings.
The label is a NUL byte terminated ASCII string using the restricted set
[a..Z0..9_] for the label field of the input event indicating the LABELHINT
an input corresponds to.
Description is a short NUL byte terminated UTF-8 encoded GEOHINT language
adjusted description of what the input does.
The symbol is a single NUL byte terminated UTF-8 encoded UNICODE codepoint
that can be used as a visual reference for the label in a user interface.
The subid, datatype and modifiers fields correspond to the default bindings
for the Input as per the corresponding Input event.
REGISTER (16)
title : u8[64]
type : u8
guid : u64[2]
The REGISTER event is only valid PRIOR to receiving an ACTIVATE event. The
title field is a NUL byte terminated UTF-8 encoded immutable user presentable
text identifier.
The type field corresponds to the set of types described in the SEGMENT-REQUEST
event.
The GUID is a 128-bit value packed as 2 64-bit values as per [RFC4122] for
use with remembering windowing system local properties across sessions.
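The packing of the u64[2] guid field can be sketched with the Python standard library `uuid` module. The order of the two halves (most significant half first) is an assumption made for this example, and the helper names are hypothetical.

```python
import uuid

MASK64 = (1 << 64) - 1


def uuid_to_guid(value):
    """Split a UUID into two 64-bit integers (assumed order: high, low)."""
    return ((value.int >> 64) & MASK64, value.int & MASK64)


def guid_to_uuid(guid):
    """Rebuild the UUID from the two 64-bit halves."""
    high, low = guid
    return uuid.UUID(int=(high << 64) | low)
```

A source would typically generate the value once, for example with `uuid.uuid4()`, and persist it so that the same identity is presented across sessions.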
ALERT (17)
message : u8[78]
multipart : u8
The ALERT event is used to provide a notification of content changes of
immediate importance to the user. The message is multipart terminated (0) and
UTF8-encoded based on preferences from the latest received GEOHINT command.
BCHUNKSTATE (19)
size : u64
input : u8
hint : u8
stream : u8
extension : u8[64]
identifier : u32
The BCHUNKSTATE event is used to announce capability, or loss of previously
announced capability, for handling BINARY-STREAM with paired BCHUNK-IN,
BCHUNK-OUT commands.
The hint field suggests the context for triggering the commands, with permitted
values being one out of the following:
1 : immediate
2 : all-data
3 : multipart
4 : cursor
The immediate value suggests that any windowing system local facility for
picking files, such as a save/open dialog, should be triggered immediately.
The all-data value indicates that the source has facilities for parsing and
managing any arbitrary octet stream and the extension field MUST be ignored.
The multipart value indicates that the BCHUNKSTATE should be appended to any
previously received BCHUNKSTATE events.
The cursor value indicates that the event is paired with a cursor drag-and-drop
action.
The extension is a set of UTF8-encoded, NUL byte terminated and ; separated
elements of accepted file extensions as a reduced type model.
The identifier field MUST be used with a REQFAIL command following an immediate
hinted BCHUNKSTATE event.
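Decoding the extension field is a matter of cutting at the first NUL byte and splitting on the separator. A minimal Python sketch, with the hypothetical helper name `parse_extensions`:

```python
def parse_extensions(field):
    """Decode a BCHUNKSTATE extension field into a list of extensions."""
    # Everything after the first NUL byte is padding.
    raw = field.split(b"\x00", 1)[0]
    text = raw.decode("utf-8")
    # Elements are ';' separated; empty elements are dropped.
    return [ext for ext in text.split(";") if ext]
```

For example, a field holding `png;jpg;svg` padded with NUL bytes yields the list `["png", "jpg", "svg"]`, which a file picker could use as its filter set.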
INPUTMASK (22)
device : u32
type : u32
The INPUTMASK is used to SUGGEST that the sink exclude a subset of input events
from being passed. The device field is a bitmask of the corresponding device
kind, and type is the input type to be excluded. These are described in
[Input](#input).
<a name="example-flow">
# Example Flow and Lifecycle
</a>
The following example works through a full session between a source and a sink
from authentication to interaction and data exchange with recovery from
unhandled compression.
The direction for connection initiation can be swapped based on context and
needs.
1. Source binds TCP port 6680 and listens for inbound connections.
2. Sink generates ephemeral keypair and connects to the IP and PORT of the
Source.
3. Sink sends the ephemeral-round HELLO command with version number and role.
4. Source receives the HELLO command, verifies version number and that the
role is a Sink.
5. Source generates ephemeral keypair, sends an ephemeral HELLO command and
derives the ephemeral session keys.
6. Sink receives the HELLO command and derives the ephemeral session keys.
7. The HELLO handshake is repeated using the derived session keys, providing
the real public keys.
8. Source performs local window system dependent operation to access the
software to share and sends a REGISTER event.
9. Sink provides the initial set of target commands, terminating with an ACTIVATE
event.
10. Source sends DEFINE-VIDEO-STREAM with parameters according to the software
that has been shared and marks the encoding as H264. It then starts
compressing video frames as they are received from the local software.
11. Sink receives the DEFINE-VSTREAM command, notes that it doesn't support the
compression format used and sends a STREAM-CANCEL command with the reason
that the format is unsupported. It discards any video frame packets
belonging to the stream that may be in flight.
12. Source receives the STREAM-CANCEL and resubmits the frame in the lossless
ZSTD compressed full/delta frame format.
13. Sink receives the stream, unpacks into the local windowing system, ensuring
that the window dimensions match those of the received frame.
14. When Sink receives device input or events with a matching translation in
the event model from the local windowing system, it repackages them and
formats as Event packets.
15. Source unpacks and forwards event packages as they arrive, making sure to
quickly convert any frames they might produce.
16. When the user decides to close the window, the Sink sends a SHUTDOWN
command on the channel and both ends close the TCP socket.
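The recovery from an unsupported compression format in steps 10 to 12 can be modelled as a simple fallback loop. The Python sketch below is a toy simulation of that exchange only; the strings in the log are labels for illustration and not wire formats, and `negotiate_codec` is a hypothetical name.

```python
def negotiate_codec(source_preferences, sink_supported):
    """Simulate the source retrying formats until the sink accepts one.

    Returns (chosen_format_or_None, log_of_exchanged_commands).
    """
    log = []
    for codec in source_preferences:
        # Source announces the stream in its next preferred format.
        log.append("DEFINE-VIDEO-STREAM " + codec)
        if codec in sink_supported:
            return codec, log
        # Sink rejects the stream, source falls through to the next format.
        log.append("STREAM-CANCEL unsupported " + codec)
    return None, log
```

With preferences `["H264", "ZSTD"]` and a sink that only supports `"ZSTD"`, the log shows one cancelled attempt followed by a successful one, mirroring the flow above.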
<a name="directory-extension">
# Directory Extension
</a>
The 'role' specified during authentication can, as mentioned, be source, sink,
probe or directory. The directory one is an extension to the base protocol
which adds alternate context interpretation of some events, as well as adding a
handful of new commands.
This extension is experimental, and some commands may be modified with
revisions to this document.
The purpose of the directory is to work as rendezvous for discovery, state
storage, transform and messaging for your fleet of a12 capable clients.
A client connected to a directory with the 'sink' role MAY be permitted to
list as an available data sink. Consequently, a client connected with the
'source' role MAY be permitted to list as an available data source.
A client connected as probe may use the LIST command (9):
notify : u8
If notify is set to 1, the directory MAY send updates to the index of available
directory resources at any time.
The directory server SHOULD reply to a LIST with zero or more DIRECTORY-STATE
(10) and zero or more DIRECTORY-DISCOVER (11) commands.
DIRECTORY-STATE (10) contains the following fields:
identifier : u16
flags : u16
reserved-2 : u16
checksum : u8[4]
size : u64
name : u8[16]
description : u8[94]
Identifier is a directory-local identifier for an appl. An appl is a set of
Lua scripts and ancillary resources packaged according to the FAP format
below. The Identifier (0) is reserved and it is RECOMMENDED that the directory
allocate identifiers incrementally from 1 and onwards for available appls.
The flags field is a bitmap of server-provided attributes:
1 : default - The recommended download for a client.
2 : tui - The FAP package contains a text-only application.
The default flag is provided to support automated setups, e.g. booting a
premade OS image, where one might not want to embed an explicit name.
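An automated client could use the flags bitmap to pick an appl without knowing its name. The Python sketch below treats the listed values 1 and 2 as bit masks, which is an assumption based on the word "bitmap". The names `ApplFlags` and `pick_default` are hypothetical.

```python
from enum import IntFlag


class ApplFlags(IntFlag):
    DEFAULT = 1  # the recommended download for a client
    TUI = 2      # the FAP package contains a text-only application


def pick_default(entries):
    """Return the identifier of the first entry flagged as default.

    entries is an iterable of (identifier, flags, name) tuples collected
    from DIRECTORY-STATE commands; returns None if nothing is flagged.
    """
    for identifier, flags, _name in entries:
        if flags & ApplFlags.DEFAULT:
            return identifier
    return None
```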
DIRECTORY-DISCOVER (11) contains the following fields:
role : u8
state : u8
petname : u8[16]
public-key : u8[32]
namespace : u16
With role indicating 0 for a source, 1 for a sink and 2 for a linked directory.
The state field is set to:
0 : removed
1 : added
2 : added-immediate
The petname matches the petname provided when the other end negotiated its
connection as per the HELLO command. It is possible for the directory to
provide multiple entries for the same petname using different public-keys. This
is useful when there is on-demand load balancing provisioning of sources.
The namespace is set to 0 UNLESS the discovered resource is bound to an appl
that the client has previously joined. This means that the source or sink is
intended for that context only and can be routed to the local runner of that
appl directly.
The 'added-immediate' notification state indicates that this is a resource
only directed to the recipient and SHOULD be opened as soon as possible.
To access an announced source, sink or directory, the DIRECTORY-OPEN (12)
command can be used:
mode : u8
public-key: u8[32]
This requests that the directory server provides a connection to the source,
sink or linked directory, accepting the public-key provided. Three different
modes are supported and can be provided as a bitmap:
1 = inbound
2 = outbound
4 = tunnel
The mode chosen depends on reachability of the two endpoints and the selection
is up to the discretion of the directory implementation. Of particular note is
the tunnel mode (4), where the pre-existing directory connection will be used to
tunnel a connection between the two.
The directory server MUST respond to an OPEN command with a 'DIRECTORY-OPENED'
command (13). This command carries the following fields:
status : u8
address : u8[46]
port : u16
secret : u8[12]
public-key : u8[32]
Status can be one of:
0 indicating failure
1 direct inbound connection
2 direct outbound connection
3 tunnel
The connection can fail (status = 0) if there is no negotiable solution
for the two endpoints to reach each other, or if one or the other has
disconnected while the request was made.
The address field will carry an ASCII encoded IPv4 or IPv6 address in the case
of an inbound or outbound connection, or a directory-local tunnel ID if the
directory server will act as a tunnel.
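How a client might interpret the status and address fields together is sketched below in Python, using the standard library `ipaddress` module to validate direct addresses. The helper name `interpret_opened` is hypothetical, the address is assumed to be NUL padded, and the tunnel ID is returned as opaque text because its format is not described here.

```python
import ipaddress


def interpret_opened(status, address_field):
    """Return (kind, target) from DIRECTORY-OPENED status and address."""
    # Assumption: the 46 byte field is NUL padded after the ASCII text.
    text = address_field.split(b"\x00", 1)[0].decode("ascii")
    if status == 0:
        return ("failed", None)
    if status in (1, 2):
        kind = "inbound" if status == 1 else "outbound"
        # Raises ValueError if the text is not a valid IPv4/IPv6 address.
        return (kind, ipaddress.ip_address(text))
    if status == 3:
        return ("tunnel", text)
    raise ValueError("unknown DIRECTORY-OPENED status: %d" % status)
```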
The provided public-key is the key the other end will use to initiate the
connection, and the secret will be used as passphrase for authenticating the
initial HELLO command.
The directory MUST use a UNIQUE cryptographically secure pseudorandom number
generator for generating the secret. The single exception is if the entry to be
opened is a referential link to a different directory server that may not
support an authentication secret. In that case the secret will be set to
'SETECASTRONOMY', matching the a12 connection default.
The negotiated tunnel identifier corresponds to a channel, and transfers
to/from the other end come as BINARY packets across that channel. Any
other activity on that channel MUST be ignored.
To terminate a tunnel relay session, any of the three parties (SOURCE, SINK,
DIRECTORY) issues a DROP-TUNNEL (14) command with the matching identifier as
the payload.
<a name="file-transfer">
## File Transfer
</a>
The DEFINE-BINARY-STREAM command is used to initiate a binary transfer as per
[Section 5.4, Binary](#binary). For regular file transfers, the pairing event
used to transfer name and storage and as a trigger for creating the stream is
BCHUNKSTATE, which MUST carry a namespace selection identifier, a request
identifier, a unique name as the 'extensions' part of the event and desired
direction.
The namespace identifier corresponds to (0 = private) or a valid APPLID
provided as a response or notification following the LIST command. Any name
starting with a dot '.' is reserved for protocol use.
The name '.index' SHALL be used to transfer a list of files available in the
namespace. The format for the .index is encoded as a number of line separated
entries using UTF-8 encoded key[=value] with : as separator between keys.
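A reader for the .index format could look like the Python sketch below. It takes the description literally: one entry per line, items separated by ':' and each item being either `key` or `key=value`. Whether values may themselves contain ':' is not specified, so this sketch assumes they do not. The function name and the key names used in the usage note are hypothetical.

```python
def parse_index(data):
    """Parse .index bytes into a list of dicts, one per line.

    Keys without a value map to None.
    """
    entries = []
    for line in data.decode("utf-8").splitlines():
        if not line:
            continue
        entry = {}
        for item in line.split(":"):
            key, sep, value = item.partition("=")
            entry[key] = value if sep else None
        entries.append(entry)
    return entries
```

For instance, the input `name=a.txt:size=12` followed by a second line `name=b:hidden` would produce two dictionaries, the second one holding `hidden` with the value `None`.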
If a requested name does not exist (for download), or there is insufficient
permission (for upload or download), the directory server MUST respond with a
TARGET\_COMMAND\_REQFAIL event with the corresponding request identifier.
If the request is permitted, the directory server MUST initiate the transfer
through the DEFINE-BINARY-STREAM command.
The '.appl' name is reserved for accessing the directory server appl
store in order to update the client side portion of an appl. This is a
feature for a developer and should have strict key-bound access control.
The '.ctrl' name is reserved for accessing the directory server side
controller store in order to update the controller side portion of an appl.
This is a feature for a developer and should have strict key-bound access
control.
The '.monitor' name is reserved for attaching an implementation
defined debug interface for the specified namespace. This is a feature for a
developer (ns != 0) or administrator (ns == 0) and should have strict key-bound
access control.
The '.applhost' name is reserved for requesting that the directory
host and run the appl identified by the applid. If supported, a corresponding
source-immediate notification will be sent.
The '.tui' name is reserved for downloading an alternate
implementation
An empty name and a valid identifier is a request to download necessary data to
run an appl locally. This should trigger a server side initiated binary stream
with (optionally) type STATE and (necessary) type APPL.
<a name="linking">
## Linking
</a>
It is possible to link multiple directory servers together to form a larger
network of servers. Such a connection is indicated in the initial HELLO command
specifying a role as either 'directory' or 'directory-reference'.
Linked directories come in two forms, unified and referential.
A unified link means that the directory servers cooperate to provide a single
namespace of resources and authentication. This is done in order to increase
redundancy, balance load and optimise network transfers.
A referential link means that the outbound client can route and tunnel local
clients to the linked directory. This is done in order to provide discovery.
Such a link SHOULD also be announced to local sinks as DIRECTORY-DISCOVER
commands.
<a name="fap-format">
## FAP format
</a>
FAP - (Format, Arcan, Package) is used to package an appl as described in the
DIRECTORY-STATE command. These use the same key/value encoding scheme as with
.index files and metadata bstream type, where each entry MUST contain a 'path'
entry, a 'name' entry, a 'size' entry, a 'hash' entry and a 'perm' entry.
It MAY also have a leading 'sign' and 'ksig' entry for signature verification
as per REKEY mode=1.
It MAY also have a 'tui' entry to indicate that the appl is text-only.
The 'size' entry MUST be set to a string encoded value of the number of bytes
belonging to path/name to consume from the bytestream unencoded. This will be
followed by a new entry until there are no more bytes left to consume.
The 'hash' entry value MUST be a base64 encoded 16 byte BLAKE3 hash of the full
FAP file excluding the header.
The 'perm' entry MUST be either 'restricted' or 'desktop' which determines the
set of local applications that the packaged appl is allowed to access. The
default is 'restricted' and such an appl is not permitted to launch local
clients or frameservers that would provide non-interactive access to wider
system resources.
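The MUST requirements on a FAP entry lend themselves to a small validation routine. The Python sketch below checks an entry that has already been decoded into a dictionary of strings; it assumes that the 'size' value is written in decimal, and it does not verify the hash or any signature. The names `REQUIRED`, `PERMS` and `check_fap_entry` are hypothetical.

```python
REQUIRED = ("path", "name", "size", "hash", "perm")
PERMS = ("restricted", "desktop")


def check_fap_entry(entry):
    """Return a list of problems for a decoded entry; empty if acceptable."""
    problems = ["missing '%s'" % key for key in REQUIRED if key not in entry]
    size = entry.get("size")
    # Assumption: the byte count is a string of ASCII decimal digits.
    if size is not None and not (size.isascii() and size.isdigit()):
        problems.append("size is not a decimal byte count")
    perm = entry.get("perm")
    if perm is not None and perm not in PERMS:
        problems.append("perm must be 'restricted' or 'desktop'")
    return problems
```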
<a name="discovery">
# Discovery Extension
</a>
A12 has an optional broadcast domain discovery protocol. It is intended to work
inside the message domain of a network of directory servers, in the broadcast
domain of an IPv4 network and multicast domain of an IPv6 network.
Discovery here means establishing a network path to an entity where a previous
authenticated relationship exists. The entity that should be 'discovered'
issues a beacon, and entities that should affirm knowledge of this entity
replies to a beacon given certain prerequisites.
## Beacon
The beacon consists of an 8 byte NONCE, which comes from a CSPRNG source,
followed by one H(NONCE, Kpub) entry for each X25519 public key in a set,
stacked together.
The beacon is sent in the broadcast domain. The sender then waits for 1 second
and sends a second beacon as (NONCE+1, H(NONCE+1, Kpub1) .. H(NONCE+1, Kpubn)).
This provides a
'proof of elapsed time' that is more expensive for the source of the beacon to
calculate, than for the recipient to verify.
The recipient of a beacon sweeps its keystore, looking for matches to H(NONCE,
Kpub) to pair to a known petname-Kpub pair.
A number of Kpub identities can be packed in the same beacon as the
authentication setup and reference tooling encourages differentiation of
multiple keypairs to one identity.
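The construction of a beacon pair can be sketched in Python as follows. Several details are assumptions made only for this example: BLAKE2b from the standard library stands in for H (it is not claimed to be the function the protocol uses), the hash input is the plain concatenation of NONCE and Kpub, the digest is 32 bytes, the NONCE is incremented as a little-endian integer, and every hash in the second beacon uses NONCE+1. The function names are hypothetical, and the 1 second wait between the two beacons is left to the caller.

```python
import hashlib
import secrets


def h(nonce, kpub):
    """Stand-in for H(NONCE, Kpub); BLAKE2b is an assumption, see above."""
    return hashlib.blake2b(nonce + kpub, digest_size=32).digest()


def build_beacon_pair(kpubs, nonce=None):
    """Return (first, second) beacon payloads for a list of public keys."""
    if nonce is None:
        # The NONCE comes from a CSPRNG source.
        nonce = secrets.token_bytes(8)
    # Assumption: NONCE+1 is computed on a little-endian 64-bit integer.
    value = int.from_bytes(nonce, "little")
    nonce_next = ((value + 1) % (1 << 64)).to_bytes(8, "little")
    first = nonce + b"".join(h(nonce, k) for k in kpubs)
    second = nonce_next + b"".join(h(nonce_next, k) for k in kpubs)
    return first, second
```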
## Beacon Response
A device that sees a valid beacon pair with a valid timeout checks its known
keystore and calculates H(NONCE, Kpub) for a match. If there is one, it
calculates the H(NONCE, Self.Kpub) for the public key used to establish the
identity in the past to the device in question.
It sweeps the keystore for any Kpub that match, and can use that to initiate
a direct connection and/or alert outer user interfaces that the paired petname
has been discovered.
This scheme ensures that it takes more time to calculate a beacon pair over
a keyset than it takes to verify it, with no amplification to bytes in flight
on an attempt to spoof.
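The keystore sweep on the receiving side can be sketched as below. As a loudly stated assumption, BLAKE2b with a 32 byte digest over the concatenation of NONCE and Kpub stands in for H, which is not defined in this section. The keystore is modelled as a plain dictionary from petname to public key, and `match_beacon` is a hypothetical name.

```python
import hashlib


def h(nonce, kpub):
    """Stand-in for H(NONCE, Kpub); BLAKE2b is an assumption, see above."""
    return hashlib.blake2b(nonce + kpub, digest_size=32).digest()


def match_beacon(beacon, keystore, digest_size=32):
    """Return the sorted petnames whose H(NONCE, Kpub) appears in a beacon.

    keystore maps petname -> public key bytes.
    """
    nonce, body = beacon[:8], beacon[8:]
    # Slice the stacked hashes into a set for constant time lookups.
    seen = {body[i:i + digest_size] for i in range(0, len(body), digest_size)}
    return sorted(name for name, kpub in keystore.items() if h(nonce, kpub) in seen)
```

A full implementation would run this on both beacons of a pair and also check the timing between them before acting on a match.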
<a name="reference-implementations">
# Tools and Reference Implementations
</a>
The repository at fossil.arcan-fe.com contains libraries and command-line
tools which act as the reference implementation for the protocol.
The components involved are:
* libarcan-a12 - protocol implementation
* libarcan-shmif-server - a local IPC system for acting as a building block
to a display server, sharing event model with the protocol.
* libarcan-shmif - the client end to libarcan-shmif-server
<a name="future-changes">
# Future Changes
</a>
The following additions are planned, primarily to the Directory extension:
* PQ safe hybrid mode - Using REKEY and a binary stream to transfer ML-KEM 768
credentials (https://openquantumsafe.org/liboqs/algorithms/kem/ml-kem.html)
<a name="acknowledgements">
# Acknowledgements
</a>
Parts of this work were funded by the NGI0 Entrust fund administered by NLnet
and supported through the European Commission 'Next Generation Internet'
Programme.
Work on the specification and reference tooling has been provided by Bjorn
Stahl and Valts Liepins.
<a name="references">
# References
</a>
[HKDF]: https://eprint.iacr.org/2010/264 "Cryptographic Extraction and Key
Derivation: The HKDF Scheme", Proceedings of CRYPTO 2010, Krawczyk, H.
[BLAKE3]: https://www.ietf.org/id/draft-aumasson-blake3-00.html "The BLAKE3
Hashing Framework", Aumasson, J-P. Neves, S. O'Connor, J.,
Wilcox, Z.
[ZSTD]: https://datatracker.ietf.org/doc/html/rfc8878 "Zstandard Compression
and the 'application/zstd' Media Type", Colette, Y.
[H264]: https://www.loc.gov/preservation/digital/formats/fdd/fdd000081.shtml "
MPEG-4, Advanced Video Coding (Part 10) (H.264)"
[CHACHA20]: https://datatracker.ietf.org/doc/html/rfc7539 "ChaCha20 and Poly1305
for IETF Protocols", Nir, Y., Langley, A.
[TOOMUCHCRYPTO]: https://eprint.iacr.org/2019/1492 "Too Much Crypto",
Aumasson, J-P
[SDL2-CODES]: https://github.com/libsdl-org/SDL/blob/SDL2/include/SDL_keycode.h
[SHMIF]: https://arcan-fe.com/2024/11/21/a-deeper-dive-into-the-shmif-ipc-system, "
A deeper dive into the SHMIF IPC system", Stahl, B.
[ARCAN]: https://arcan-fe.com
[XLIBREF]: "XLIB Reference Manual R5", Nye, A.
[RFC4254]: https://datatracker.ietf.org/doc/html/rfc4254 "The Secure Shell (SSH)
Connection Protocol", Ylonen, T., C. Lonvick
[RFC6143]: https://datatracker.ietf.org/doc/html/rfc6143 "The Remote Framebuffer
Protocol", Richardson, T., Levine, J.
[RFC4122]: https://datatracker.ietf.org/doc/html/rfc4122 "A Universally Unique
IDentifier (UUID) URN Namespace", Leach, P., Mealing, M., Saltz, R.
[RFC8081]: https://datatracker.ietf.org/doc/html/rfc8081 "The 'font'
Top-Level Media Type", Lilley, C.
[SPICE]: https://www.spice-space.org/spice-protocol.html "The SPICE Protocol"
[W3CCSS]: https://www.w3.org/TR/CSS21/ui.html "Cursors: the cursors property"
[MSRDP]: https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-rdpbcgr
"Remote Desktop Protocol"
[HASHCASH]: http://www.cypherspace.org/adam/hashcash/ "Hashcash", Back, A., 1997