Kestrel-3

Check-in Differences

Difference From da8e336246ef213b To 1f7133f0d8dd8e07

2020-01-26
23:03
Choose better name for section check-in: 34a0e063d2 user: kc5tja tags: makerom
20:40
Inserted a lot of explanation behind why I chose the magic constants that I did. They probably don't belong here. They probably need to reside in a separate reference document. But, at least I have them under revision control somewhere. check-in: 1f7133f0d8 user: kc5tja tags: makerom
18:50
Update brickie check-in: da8e336246 user: kc5tja tags: makerom
18:49
Introduce doc and code directives to make "weaving" easier. check-in: a28155cb66 user: kc5tja tags: trunk
18:09
Starting work on makerom utility check-in: a06942ac8e user: kc5tja tags: makerom

Changes to dev/src/makerom/01-00100-intro.

(:doc)
## Example Load File

Let's suppose we are given the following load file contents to work with.
For our current purposes, we are going to assume just three segments in the order listed below.

- a HUNK_CODE segment.  The HUNK_CODE segment will contain the "program instructions" that we want to burn into ROM.  For now, they're garbage data designed to make it easy to confirm, interactively, that the `makerom` tool operates properly.
- a HUNK_RODATA segment.  The HUNK_RODATA segment will contain the passive data that our "program" depends on to run.
- a HUNK_BSS segment.  The HUNK_BSS segment basically identifies which segment will refer to RAM-resident storage.  This will be used for address fix-up purposes later on.

We'll start by declaring the load file header,
which tells the loader how many segments it has to deal with,
and how large each of them will be.
(It also has a few other features which are irrelevant to `makerom`.)

(:top)





\ begin load file contents

CREATE loadfile
  HUNK_HEADER ,
  0 ,  ( no shared libraries referenced )
  3 ,  ( we have three segments in the load file )
  0 ,  ( the first segment index is 0 )
  2 ,  ( the last segment index is 2 )
  \ sizes...

\ end load file contents

(:doc)
The first hunk the loader should attempt to place is the HUNK_CODE hunk.
In this example, I'm just going to use 8 dwords of easily discoverable patterns,
so that we can confirm proper `makerom` operation through visual inspection.







(:before end load file contents)
  HUNK_CODE ,
  8 ,  ( 8 dwords of "program" code follows )
HEX
  1111111111111111 ,
  2222222222222222 ,

(:doc)
## Hunk Types

A hunk format load file can contain any number of different kinds of hunks.
Most hunks have IDs assigned arbitrarily; in general, the specific number assigned doesn't matter.
However, for code, data, and BSS segments, hunk IDs do follow a pattern.

As I write this document, code and data hunk IDs range from 100000 to 100007.
If you subtract 100000 from the hunk ID, you will end up with a number in the range of 0 to 7;
this conveniently maps to the Unix-style R, W, X permission bits.
These permission flags tell the loader how the segment is intended to be used.
The table below indicates the kinds of segments that make sense:

| R | W | X | Type  | Notes |
|:-:|:-:|:-:|:------|:------|
| 0 | 0 | 0 |       |Reserved.|
| 0 | 0 | 1 |XOCODE |Use for code which does not need to read its own content as data.|
| 0 | 1 | 0 |       |Reserved.|
| 0 | 1 | 1 |       |Reserved.|
| 1 | 0 | 0 |RODATA |Read-only Data.  Contains constants, look-up tables, etc.|
| 1 | 0 | 1 |CODE   |General purpose block of program code.  Code can read its own content as data.|
| 1 | 1 | 0 |DATA   |Read-write Data.  Used to contain initialized, but nonetheless mutable, data structures and variables.|
| 1 | 1 | 1 |       |Reserved.|
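The mapping in the table above can be sketched in a few lines of Python. This is only an illustration, not part of `makerom`: the function name `decode_hunk_id` and the dictionary `SEGMENT_TYPES` are hypothetical, and the base value 100000 is the one quoted in the text. The BSS hunk ID falls outside the 100000 to 100007 range, so this sketch does not attempt to decode it.

```python
HUNK_PERM_BASE = 100000  # base value quoted in the text above

# Segment names from the table, keyed by the 3-bit (R, W, X) value.
# Combinations missing from this dictionary are the reserved ones.
SEGMENT_TYPES = {
    0b001: "XOCODE",
    0b100: "RODATA",
    0b101: "CODE",
    0b110: "DATA",
}


def decode_hunk_id(hunk_id):
    """Return (r, w, x, type_name) for a code/data hunk ID.

    type_name is None when the permission combination is reserved.
    """
    bits = hunk_id - HUNK_PERM_BASE
    if not 0 <= bits <= 7:
        raise ValueError(f"{hunk_id} is not a code/data hunk ID")
    r = (bits >> 2) & 1
    w = (bits >> 1) & 1
    x = bits & 1
    return r, w, x, SEGMENT_TYPES.get(bits)
```

For example, `decode_hunk_id(100005)` yields the R and X bits set with the name `"CODE"`, matching the sixth row of the table.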

### XOCODE versus CODE

The distinction between CODE and XOCODE segments is primarily of interest to operating systems which employ page-level protected memory mechanisms.
For `makerom`'s purposes, these two segment types are treated the same.
But, why have both to begin with?

As of this writing, the official RISC-V instruction set architecture lacks the ability
to load a register with an arbitrary constant larger than a signed, 12-bit value in a single instruction (with some pathological exceptions).
For example, to load a value between -2048 and 2047, you can use this instruction:

    addi  t0,x0,N  ; -2048 <= N < 2048

For values of N which range between -2^31 <= N < 2^31 *and* where N is a multiple of 4096, you could get away with:

    lui   t0,PAGE32(N)

For all other numeric constants, however, you must use more than one instruction.
For any generalized signed 32-bit value, for instance, you can combine the two instructions above like so:

    lui   t0,PAGE32(N)
    addi  t0,t0,OFFSET(N)
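The text does not define `PAGE32` and `OFFSET`, so the following Python sketch assumes the conventional split: `OFFSET` is the low 12 bits of N treated as a signed value (because `addi` sign-extends its immediate), and `PAGE32` is the upper 20 bits adjusted to compensate. The helper names `split_hi_lo` and `lui_addi` are hypothetical, and the register is modelled as 32 bits wide.

```python
def split_hi_lo(n):
    """Split a signed 32-bit n into (hi20, lo12).

    lo12 is the low 12 bits of n, sign-extended the way addi would.
    hi20 is the 20-bit lui immediate that makes (hi20 << 12) + lo12
    equal n modulo 2**32.
    """
    if not -2**31 <= n < 2**31:
        raise ValueError("n must fit in a signed 32-bit value")
    lo12 = ((n & 0xFFF) ^ 0x800) - 0x800
    hi20 = ((n - lo12) >> 12) & 0xFFFFF
    return hi20, lo12


def lui_addi(hi20, lo12):
    """Model lui followed by addi on a 32-bit register; return a signed result."""
    reg = ((hi20 << 12) + lo12) & 0xFFFFFFFF
    return reg - 2**32 if reg & 0x80000000 else reg
```

Note the adjustment when bit 11 of N is set: for N = 2048 the split is `hi20 = 1`, `lo12 = -2048`, since 4096 - 2048 = 2048.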

For values larger than 32 bits, you need to synthesize the constant somehow.
One approach used by C compilers on Linux and BSD systems is to repeatedly shift in bits:

    addi  t0,x0,bits        ; contribute the first 12 bits
    slli  t0,t0,11
    ori   t0,t0,more_bits   ; next 11 bits (bit 12 must be 0 b/c of sign extension)
    slli  t0,t0,11
    ori   t0,t0,more_bits   ; next 11 bits...
    slli  t0,t0,11
    ...etc...

The exact sequence of instructions will obviously depend on the assembler or compiler used **and** on the number being encoded;
but it's clear that, in the worst case, the process will be slow and tedious,
shifting in just 11 bits (just shy of 1.5 bytes) of content per pair of instructions.
Thankfully, this doesn't happen frequently, and software performance is not generally impacted.
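To make the cost concrete, here is a small Python model of the shift-and-or sequence shown above. It is a sketch under assumptions, not what any particular compiler emits: the names `plan_chunks` and `run_sequence` are hypothetical, the register is modelled as 64 bits wide, and the constant is always split the same way (a 9-bit seed for `addi` followed by five 11-bit chunks for `ori`).

```python
MASK64 = (1 << 64) - 1


def plan_chunks(value):
    """Split a 64-bit value into an addi seed plus five 11-bit ori chunks."""
    value &= MASK64
    # Five 11-bit chunks cover bits 0..54, most significant chunk first.
    chunks = [(value >> shift) & 0x7FF for shift in (44, 33, 22, 11, 0)]
    # The remaining 9 high bits (0..511) fit in addi's positive range.
    seed = value >> 55
    return seed, chunks


def run_sequence(seed, chunks):
    """Model addi followed by repeated slli-by-11 / ori pairs."""
    reg = seed & MASK64
    for chunk in chunks:
        reg = (reg << 11) & MASK64  # slli t0,t0,11
        reg |= chunk                # ori  t0,t0,chunk
    return reg
```

Under this fixed split, an arbitrary 64-bit constant costs one `addi` plus five `slli`/`ori` pairs, eleven instructions in total.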

To work around this limitation, you could store your large constants in a *constant pool*.
This is a separate chunk of memory reserved for this purpose, and permanently referenced using a well-known CPU register.
Typically, `gp` (global pointer) is used for this purpose; but, RISC-V being RISC, it could really be any CPU register your ABI wishes.

    ld    t0,k_offset(gp)

The only disadvantage of this approach is that
`gp` must be pre-initialized *before* invoking any procedure or function in your module.
This greatly complicates language ABIs;
a "function pointer" in C no longer is a single value, but rather
a pointer to a small structure which *itself* both points to the intended function *and* contains the expected `gp` value for that function.
(Compare this against Intel-style "far" memory pointers and global/local descriptors;
in this case, `gp` serves a role analogous to the `CS` register in x86-32 architectures.)

Observe that these techniques **do not** reference the program space as data;
therefore, all of these approaches will work in both CODE and XOCODE segments.

I am a creature of convenience, however;
I prefer to instead store my pool of large constants close to the code that uses them,
and simply read them as data when required.

        dword big_number_here
    _procedure:
        auipc t0,0
        ld    t0,-8(t0)

Since this approach obviously treats code space as data,
this semantic encoding will not work in an XOCODE segment
(at least, when loaded into an OS which uses page-level memory protection).
The moment the CPU executes the `ld` instruction on an XOCODE segment in memory,
a page fault trap will be raised.

However, the benefit includes not having to worry about obscure ABI machinations,
simulating segmentation, etc.
When in doubt, place your data in a CODE hunk.
Use XOCODE hunks only if you are certain they are clean of memory fetches.

### RODATA versus DATA, BSS

Unlike CODE vs XOCODE hunks,
`makerom` *does not* treat BSS, DATA, and RODATA hunks the same way.

RODATA hunks are used to hold *immutable*, pre-initialized data;
because they are immutable, they may safely reside with (XO)CODE hunks in a ROM image.
BSS and DATA hunks, however, hold data which *could be* mutated, even if the software never actually changes it.
For this reason, both BSS and DATA hunks must reside in RAM, not in ROM.

For these reasons, `makerom` treats RODATA hunks the same as CODE and XOCODE hunks.
However, it will exit with an error if it finds a DATA hunk in your input load file.
Yet, it will happily process a load file that has a single BSS hunk.

The reason why BSS gets special dispensation has to do with the semantic differences between DATA and BSS hunks.
Both DATA and BSS hunks need to reside in RAM; however,
DATA hunks are *pre-initialized*, while BSS hunks are *uninitialized*.
Since BSS hunks are uninitialized, they take *no space* in either the load file or in ROM.
Therefore, `makerom` is able to fully support address relocations into BSS once it's been given the location of its BSS segment.
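The policy described above can be summarized as a small Python sketch. It is an illustration of the rules as stated, not `makerom`'s actual code: the function name `place_hunk` and the returned strings are hypothetical, and the hunk IDs are the ones this document assigns to each hunk type.

```python
# Hunk IDs as assigned in this document.
HUNK_XOCODE = 100001
HUNK_RODATA = 100004
HUNK_CODE = 100005
HUNK_DATA = 100006
HUNK_BSS = 100016

# Hunks whose contents are immutable and may therefore live in ROM.
ROM_HUNKS = {HUNK_XOCODE, HUNK_RODATA, HUNK_CODE}


def place_hunk(hunk_id):
    """Return "rom" or "ram" for a supported hunk; raise for anything else."""
    if hunk_id in ROM_HUNKS:
        return "rom"
    if hunk_id == HUNK_BSS:
        # Uninitialized, so it occupies no space in the load file or ROM.
        return "ram"
    if hunk_id == HUNK_DATA:
        raise ValueError("DATA hunks are pre-initialized and need RAM; not supported")
    raise ValueError(f"unrecognized hunk ID {hunk_id}")
```

The sketch treats CODE, XOCODE, and RODATA identically, which mirrors the statement that `makerom` handles all three the same way.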

**NOTE:** I have aspirations for a future version of `makerom`
that supports DATA hunks by requiring the ROM code to copy the ROM-resident image of the DATA segment into RAM first.
However, I need to study in more detail how to make this work.

## Example Load File

Let's suppose we are given the following load file contents to work with.
For our current purposes, we are going to assume just three segments in the order listed below.

- a HUNK_CODE segment.  The HUNK_CODE segment will contain the "program instructions" that we want to burn into ROM.  For now, they're garbage data designed to make it easy to confirm, interactively, that the `makerom` tool operates properly.
- a HUNK_RODATA segment.  The HUNK_RODATA segment will contain the passive data that our "program" depends on to run.
- a HUNK_BSS segment.  The HUNK_BSS segment basically identifies which segment will refer to RAM-resident storage.  This will be used for address fix-up purposes later on.

We'll start by declaring the load file header,
which tells the loader how many segments it has to deal with,
and how large each of them will be.
(It also has a few other features which are irrelevant to `makerom`.)

(:top)
\ begin hunk format magic constants

99999 CONSTANT HUNK_HEADER

\ end hunk format magic constants
\ begin load file contents

CREATE loadfile
  HUNK_HEADER ,
  0 ,  ( no shared libraries referenced )
  3 ,  ( we have three segments in the load file )
  0 ,  ( the first segment index is 0 )
  2 ,  ( the last segment index is 2 )
  \ sizes...

\ end load file contents

(:doc)
The first hunk the loader should attempt to place is the HUNK_CODE hunk.
In this example, I'm just going to use 8 dwords of easily discoverable patterns,
so that we can confirm proper `makerom` operation through visual inspection.

(:before end hunk format magic constants)
100001 CONSTANT HUNK_XOCODE
100004 CONSTANT HUNK_RODATA
100005 CONSTANT HUNK_CODE
100006 CONSTANT HUNK_DATA

(:before end load file contents)
  HUNK_CODE ,
  8 ,  ( 8 dwords of "program" code follows )
HEX
  1111111111111111 ,
  2222222222222222 ,



We haven't discussed relocations and address fix-ups yet; we'll get to these later.
But, for now, just know that they exist.

Also, notice that we do not include any content for the BSS segment.
BSS ([Block Started by Symbol](https://en.wikipedia.org/wiki/.bss)) segments are intended to hold *uninitialized* data.
It is the responsibility of the software in the code segment to pre-initialize this block if required
(which it usually is).




(:before end load file contents)
  HUNK_BSS ,
  32 ,  ( 32 dwords of uninitialized variables exist "somewhere" )

(:before sizes...)
  32 8 * ,  ( BSS hunk is 256 bytes in size, but observe we don't specify them! )

(:doc)
The final hunk is a sentinel, intended to cause the graceful exit of a program loader.
If reading from a file, an end-of-file condition can be used to signal the end of the program stream as well;
however, as our tests will be reading from memory, we need the sentinel to know where to stop reading.




(:before end load file contents)
  HUNK_END ,










We haven't discussed relocations and address fix-ups yet; we'll get to these later.
But, for now, just know that they exist.

Also, notice that we do not include any content for the BSS segment.
BSS ([Block Started by Symbol](https://en.wikipedia.org/wiki/.bss)) segments are intended to hold *uninitialized* data.
It is the responsibility of the software in the code segment to pre-initialize this block if required
(which it usually is).

(:before end hunk format magic constants)
100016 CONSTANT HUNK_BSS

(:before end load file contents)
  HUNK_BSS ,
  32 ,  ( 32 dwords of uninitialized variables exist "somewhere" )

(:before sizes...)
  32 8 * ,  ( BSS hunk is 256 bytes in size, but observe we don't specify them! )

(:doc)
The final hunk is a sentinel, intended to cause the graceful exit of a program loader.
If reading from a file, an end-of-file condition can be used to signal the end of the program stream as well;
however, as our tests will be reading from memory, we need the sentinel to know where to stop reading.
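To show why the sentinel matters when reading from memory, here is a rough Python sketch of a reader that walks the hunks following the header. It rests on several assumptions: the load file is modelled as a plain list of integers (one per dword cell), the function name `walk_hunks` is hypothetical, CODE-like hunks are taken to be an ID, a dword count, and that many dwords of content, and relocation hunks are not modelled at all.

```python
# Hunk IDs as assigned in this document.
HUNK_END = 99998
HUNK_XOCODE = 100001
HUNK_RODATA = 100004
HUNK_CODE = 100005
HUNK_BSS = 100016

# Hunks that carry their contents in the load file.
HUNKS_WITH_PAYLOAD = {HUNK_XOCODE, HUNK_RODATA, HUNK_CODE}


def walk_hunks(cells):
    """Return [(hunk_id, length_in_dwords, payload)] up to HUNK_END."""
    hunks = []
    pos = 0
    while True:
        if pos >= len(cells):
            raise ValueError("ran off the end without finding HUNK_END")
        hunk_id = cells[pos]
        pos += 1
        if hunk_id == HUNK_END:
            return hunks  # anything after the sentinel is ignored
        if pos >= len(cells):
            raise ValueError("hunk is missing its length cell")
        length = cells[pos]
        pos += 1
        if hunk_id in HUNKS_WITH_PAYLOAD:
            payload = cells[pos:pos + length]
            if len(payload) != length:
                raise ValueError("hunk payload is truncated")
            pos += length
        elif hunk_id == HUNK_BSS:
            payload = []  # BSS contributes a size but no content
        else:
            raise ValueError(f"unsupported hunk ID {hunk_id}")
        hunks.append((hunk_id, length, payload))
```

Without `HUNK_END`, the loop would have no way to distinguish the end of the load file from whatever happens to follow it in memory.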

(:before end hunk format magic constants)
99998 CONSTANT HUNK_END

(:before end load file contents)
  HUNK_END ,