Kestrel-3

Files in dev/src/bcpl/assemrv/ of 5594d490f946dfa4

The assemrv command is the RISC-V RV64I assembler for the Kestrel Operating
System (KOS).  It is intended to be used in a command-command file to help
convert BCPL source to native RV64 object code.  For example:

	.K MODULE/A
	bcpl2sial from <MODULE>.b to <MODULE>.sial
	sial2rv64 from <MODULE>.sial to <MODULE>.asm
	assemrv from <MODULE>.asm to <MODULE>
	delete <MODULE>.sial <MODULE>.asm

Additional features supporting first-class assembly language programming may be
added at a later time; refer to the documentation and/or change logs for more
details.

# Directives

Unless otherwise documented, the directives that assemrv supports are taken
from the listing at https://rv8.io/asm .

Supported directives:

| Directive               | Purpose                                                                                                                   |
|-------------------------+---------------------------------------------------------------------------------------------------------------------------|
| .2byte e0[,e1[, ...]]   | Emits 16-bit words for e0, then e1, etc.  Does NOT align beforehand.                                                      |
| .4byte e0[,e1[, ...]]   | Emits 32-bit words for e0, then e1, etc.  Does NOT align beforehand.                                                      |
| .8byte e0[,e1[, ...]]   | Emits 64-bit words for e0, then e1, etc.  Does NOT align beforehand.                                                      |
| .half e0[,e1[, ...]]    | Emits 16-bit words for e0, then e1, etc.  DOES align beforehand.                                                          |
| .word e0[,e1[, ...]]    | Emits 32-bit words for e0, then e1, etc.  DOES align beforehand.                                                          |
| .dword e0[,e1[, ...]]   | Emits 64-bit words for e0, then e1, etc.  DOES align beforehand.                                                          |
| .byte e0[,e1[, ...]]    | Emits 8-bit bytes for e0, then e1, etc.  Expressions may be strings.  Useful for constructing bcpl-style counted strings. |
| .ascii "string"[, ...]  | Emits a string without any count or termination.  (Synonym for .byte)                                                     |
| .asciz "string"[, ...]  | Emits a null-terminated C-style string.  (Synonym for .string)                                                            |
| .string "string"[, ...] | Emits a null-terminated C-style string.                                                                                   |
| .zero e                 | Emits e bytes, each with the value zero.                                                                                  |
| .p2align e[,p]          | Enforce alignment to 2^e.  Fill padding bytes with p if specified; else 0.                                                |
| .balign e[,p]           | Enforce alignment to the next multiple of e.  Fill padding bytes with p if specified; else 0.                             |
| .equ sym,e              | Creates a label whose value evaluates to e.                                                                               |
| .include "filename"     | Includes contents of specified file at the inclusion point.                                                               |
| .file "filename"        | Sets the logical filename (useful when debugging code generated by other tools)                                           |
| .line e                 | Sets the logical line number (useful when debugging code generated by other tools)                                        |

Unsupported directives may or may not be accepted as no-operations,
depending on the context of the directive.
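
The padding and string rules listed in the table above can be modelled outside
the assembler.  The following Python sketch is illustrative only: the function
names are hypothetical and are not part of assemrv, and it assumes ASCII text
and single-byte fill values.

```python
def p2align_padding(offset, e, fill=0):
    """Padding bytes needed to move offset up to a multiple of 2**e."""
    return bytes([fill]) * (-offset % (1 << e))


def balign_padding(offset, n, fill=0):
    """Padding bytes needed to move offset up to the next multiple of n."""
    return bytes([fill]) * (-offset % n)


def counted_string(text):
    """BCPL-style string: a length byte followed by the characters."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("a counted string holds at most 255 characters")
    return bytes([len(data)]) + data


def asciz(text):
    """C-style string: the characters followed by a zero byte."""
    return text.encode("ascii") + b"\x00"
```

For instance, `counted_string("hello")` yields the same six bytes one would
expect from a `.byte` directive given the count 5 followed by the string, while
`asciz("hello")` models what `.asciz` or `.string` would emit.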

The following directives are unique to the KOS/Tripos environment:

| Directive | Purpose                                              |
|-----------+------------------------------------------------------|
| .gv e0,e1 | Sets global vector element e0 to the value e1.       |
| .gref e   | Specifies the largest referenced global vector slot. |

# Hunk Format Details

This assembler produces hunk format executables compatible with the loadseg()
and globin() DOS calls.  This section details the specific variant used with
the Kestrel Operating System.

A simple BCPL example as used in cintsys/cintpos does not rely on linkage
editors.  Likewise, the emitted SIAL is fully position independent.  Therefore,
programs tend to consist only of a single combined text and data segment.

	+-------------+
	| HUNK_CODE   |
	+-------------+
	| N0          |  (number of 64-bit words comprising hunk 0)
	+-------------+
	|             |
	|  .....      |
	|             |
	+-------------+
	| HUNK_END    |
	+-------------+

The HUNK_CODE hunk contains the machine language bytes comprising the body of
the program.  To support the globin() function, however, a table of global
vector initializers appears at the end of the hunk, like so:

	+-------------+
	| HUNK_CODE   |
	+-------------+
	| N0          |  (counts both machine language and global vector data!)
	+-------------+
	|             |  (machine language code and data goes here)
	|  .....      |
	|             |
	+-------------+
	| 0           |  (end of global vector initialization list)
	+-------------+
	| VALUE_n     |  (value to stuff into Gn)
	+-------------+
	| Gn          |  (global vector element index n)
	+-------------+
	| ...         |
	+-------------+
	| VALUE_0     |  (value to stuff into G0)
	+-------------+
	| G0          |  (global vector element index 0)
	+-------------+
	| LGn         |  (largest referenced global vector element)
	+-------------+
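
As a rough illustration of this layout, the Python sketch below assembles the
words of one hunk in the order drawn above.  Several things in it are
assumptions rather than facts from the KOS sources: the numeric values of
HUNK_CODE and HUNK_END are placeholders, every cell is taken to be one
little-endian 64-bit word, the diagram is read as ascending file order, and a
global index of 0 is rejected because it would be indistinguishable from the
end-of-list marker.  The name build_hunk is hypothetical.

```python
import struct

# Placeholder marker values; the real KOS constants are not given here.
HUNK_CODE = 0x3E9
HUNK_END = 0x3F2
MASK64 = (1 << 64) - 1


def build_hunk(code, gv_entries, lgn):
    """Return the bytes of one hunk laid out as in the diagram above.

    code       -- machine language bytes
    gv_entries -- (index, value) pairs, entry 0 first
    lgn        -- largest referenced global vector index
    """
    body = code + bytes(-len(code) % 8)   # pad to a whole 64-bit word
    table = [0]                            # end-of-list marker
    for index, value in reversed(gv_entries):
        if index <= 0:
            # Assumption: 0 is reserved as the end-of-list marker.
            raise ValueError("global index must be positive")
        table += [value & MASK64, index]  # VALUE_k, then Gk
    table.append(lgn)                      # LGn is the final word
    n0 = len(body) // 8 + len(table)       # N0 counts code and table words
    packed = b"".join(struct.pack("<Q", w) for w in table)
    return (struct.pack("<QQ", HUNK_CODE, n0) + body + packed
            + struct.pack("<Q", HUNK_END))
```

Passing an empty list for gv_entries produces the null initializer list: a
single 0 word followed by LGn.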

If your assembly language module lacks any global vector entries, then it
should simply have a null global vector initializer list:

	+-------------+
	| HUNK_CODE   |
	+-------------+
	| N0          |
	+-------------+
	|             |  (machine language code and data goes here)
	|  .....      |
	|             |
	+-------------+
	| 0           |  (end of global vector initialization list)
	+-------------+
	| LGn         |  (largest referenced global vector element)
	+-------------+

If the software in the section makes no references to the global
vector, LGn must be set to 0.  Otherwise, it must be set to the
largest global vector index referenced either in machine language or
in the global vector initialization list.
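
To make the LGn rule concrete, here is a small Python sketch of how a
globin()-like routine might walk the initializer list backwards from the last
word of the hunk body, together with a helper that computes the value LGn
should take.  Both function names are hypothetical, the hunk body is assumed to
be already decoded into a list of 64-bit words, and this is not the KOS
implementation of globin().

```python
def largest_global(referenced, inits):
    """LGn: the largest index referenced anywhere, or 0 if there is none."""
    return max(set(referenced) | set(inits), default=0)


def read_global_table(words):
    """Return (lgn, {index: value}) for the words of one hunk body."""
    if not words:
        raise ValueError("empty hunk body")
    pos = len(words) - 1
    lgn = words[pos]                 # final word of the hunk body
    pos -= 1
    inits = {}
    # Each entry is read as Gk first, then VALUE_k just before it.
    while pos >= 1 and words[pos] != 0:
        inits[words[pos]] = words[pos - 1]
        pos -= 2
    if pos < 0 or words[pos] != 0:
        raise ValueError("initializer list has no 0 terminator")
    if inits and max(inits) > lgn:
        raise ValueError("LGn is smaller than an initialized index")
    return lgn, inits
```

A body consisting of code followed by the words 0, 0 is therefore a module with
no initializers and no global references at all.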

# Support for Multiple Sections

Supporting multiple sections in a single file will require a more sophisticated
file format and corresponding changes to loadseg().  Cintpos is implemented
entirely without this level of support, opting to use the global vector for all
linkage at run-time.  For the time being, therefore, we omit support for
multiple sections in a single file.

However, longer-term future plans will almost certainly require support for
multiple sections.  When this becomes a requirement, the specification here
will be updated accordingly, as will the KOS implementation of loadseg().  For
example, the current format is insufficiently expressive to support building
ROM-able code.  Proper support for ROM-resident code requires support for not
just multiple sections, but different types as well (code vs data vs read-only
data, etc.).