Memory Map

Memory Map (E3, Headless/Single-Chip Design)

There is one important constraint which wasn't originally considered in the previous Kestrel-3 memory map designs. As odd as it may sound, the lack of standardization for where RISC-V processors can boot in the physical address space has profound implications for the memory map design of the computer. Since I've always intended the Kestrel-3 to be upgradable to better and faster processors than anything I can come up with on my own, it follows that my design must be accommodating to processors potentially not yet designed.

The vast majority of RISC-V processors currently in existence (ignoring register widths) boot from somewhere in low memory, although my own designs boot from high memory. (The RISC-V Privileged Specification v1.9 allowed a core to boot from either low or high memory.)

    Core        Address of First Instruction  Configurable upon Instantiation?
    PicoRV32    $00000000                     Yes
    Ariane      $0000000000010000             ?
    f32c        $00000000                     Yes
    ORCA        $0000000000000000             Yes
    riscv_vhdl  $00000000                     ?
    Rocket      $0000000000010000             No

This quick and unscientific survey tells me that the Kestrel-3 has its best chance of supporting arbitrary RISC-V processors if we reserve pages of memory around page 0 and page -1 exclusively for use by a "processor card," where a "card" is the circuit which allows an arbitrary RISC-V core to interoperate with the rest of the Kestrel-3 hardware. (The term "card" is borrowed from the Commodore-Amiga's long history of providing processor-local bus slots whose cards can override the motherboard-resident processor. These were used, for instance, to upgrade 68000-based Amiga 2000 systems, and likewise Amiga 3000/4000 systems, to 68030 through 68060 processors without having to pull the original CPU out first. "Card," in the context of an FPGA circuit, is perhaps anachronistic; however, it becomes relevant again if the signals are exposed to the outside world.)

The memory layout of the Headless Kestrel-3 was designed with easy emulation (especially by another RV64 processor) in mind. Some emulators are entirely software-defined (e.g., E2 and E3), while others can be implemented with hardware assistance (e.g., running a virtual Kestrel-3 on a Rocket core using page tables to mimic the memory map described here).

There are only four broad categories of devices visible to the CPU: resources local to the CPU card itself, Kestrel-3 ROM, Kestrel-3 RAM, and Kestrel-3 I/O devices. Each category starts on a 1 GiB boundary, which matches the largest Sv39 page size (called a mega-page here; the RISC-V privileged specification calls it a gigapage).

    MegaRPN  From                 To                   Device
    0        0000_0000_0000_0000  0000_0000_3FFF_FFFF  CPU Card Local
    1        0000_0000_4000_0000  0000_0000_7FFF_FFFF  Kestrel-3 ROM
    2        0000_0000_8000_0000  0000_0000_BFFF_FFFF  Kestrel-3 Standard I/O
    3        0000_0000_C000_0000  0000_0000_FFFF_FFFF  Kestrel-3 RAM; I/O expansion space
    4-510    0000_0001_0000_0000  FFFF_FFFF_BFFF_EFFF  unmapped; Kestrel-3 I/O Expansion Space
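
Because every category sits in its own 1 GiB slot, the slot number falls straight out of the address bits. The following is only a sketch in Python, not part of any Kestrel-3 tooling: the names `mega_rpn` and `region_of` are made up for illustration, and it assumes addresses are written in the sign-extended 64-bit form the table uses, so only nine bits above bit 29 select the slot.

```python
# Hypothetical helper names; the region labels are copied from the table above.
MEGA_PAGE_SHIFT = 30     # each slot spans 0x4000_0000 bytes (1 GiB)
MEGA_PAGE_MASK = 0x1FF   # assumption: Sv39-style, 512 slots, upper bits are sign extension

REGIONS = {
    0: "CPU Card Local",
    1: "Kestrel-3 ROM",
    2: "Kestrel-3 Standard I/O",
    3: "Kestrel-3 RAM; I/O expansion space",
}


def mega_rpn(addr):
    """Return the slot number (the MegaRPN column) for a physical address."""
    return (addr >> MEGA_PAGE_SHIFT) & MEGA_PAGE_MASK


def region_of(addr):
    """Return the table's device label for a physical address."""
    return REGIONS.get(mega_rpn(addr), "unmapped; Kestrel-3 I/O Expansion Space")
```

For instance, `region_of(0x8000_1000)` lands in slot 2, the standard I/O space.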

For the Kestrel-3 design targeting the icoBoard Gamma FPGA board, the following refinement of the above memory map will apply:

    MegaRPN  From                 To                   Device
    0        0000_0000_0000_0000  0000_0000_0003_FFFF  FPGA bitstream ROM
    0        0000_0000_0004_0000  0000_0000_000F_FFFF  KCP53000B M-mode Driver and Bootstrap
    0        0000_0000_0010_0000  0000_0000_3FFF_FFFF  unmapped
    1        0000_0000_4000_0000  0000_0000_400F_FFFF  Kestrel-3 ROM (DX-Forth v1.2/E3)
    1        0000_0000_4010_0000  0000_0000_7FFF_FFFF  unmapped
    2        0000_0000_8000_0000  0000_0000_8000_0FFF  SIA #1 (User's Console)
    2        0000_0000_8000_1000  0000_0000_8000_1FFF  SIA #2 (Mass Storage)
    2        0000_0000_8000_2000  0000_0000_BFFF_FFFF  unmapped
    3        0000_0000_C000_0000  0000_0000_C00F_FFFF  RAM
    3        0000_0000_C010_0000  0000_0000_CFFF_FFFF  unmapped
    4-511    0000_0001_0000_0000  FFFF_FFFF_FFFF_EFFF  unmapped

Unmapped Regions

All unmapped regions (in physical hardware at least) will generate either an access fault or a page fault, depending on whether or not the MMU is enabled. If the MMU is enabled, the virtual machine monitoring the virtual Kestrel-3 process should reflect the page fault back to the Kestrel-3 firmware as an access fault.

I/O Devices

You'll notice that I/O devices appear on 4KiB page boundaries. This is intended to facilitate operating systems which intend to expose specific I/O devices to specific user-mode processes. E.g., a console driver running in a microkernel would run in user-space, and have access to SIA #1 from within user-space. However, page protections will ensure that SIA #2 will be off-limits, for that resource is almost certainly going to belong to the filesystem driver.
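
To make the page-granularity point concrete, here is a small Python sketch showing that the two SIAs of the icoBoard Gamma map land on different 4KiB pages, which is what lets a page table grant one without the other. The constant and function names are mine, chosen for illustration; the base addresses come from the refined memory map.

```python
PAGE_SHIFT = 12            # 4KiB pages

SIA1_BASE = 0x8000_0000    # SIA #1 (User's Console)
SIA2_BASE = 0x8000_1000    # SIA #2 (Mass Storage)


def page_number(addr):
    """Return the 4KiB physical page number that contains addr."""
    return addr >> PAGE_SHIFT


def same_page(a, b):
    """True if both addresses would be covered by a single page mapping."""
    return page_number(a) == page_number(b)
```

All of SIA #1's registers (offsets +0 through +7) share one page, and none of them share a page with SIA #2.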

SIA: Serial Interface Adapter

| INTENA |  RXINP |  STAT  |  TXOUT | +0
|                BAUD               | +4


    +0   xxxxxxxx   W   Transmitter Data Register

This write-only 8-bit register accepts data to send serially to another peripheral. If you attempt to read from this register, the value returned is not specified.

Bits are transmitted using one start bit, 8 data bits (LSB first), no parity, and finally one stop bit.

If the SIA is busy transmitting a byte when another byte is written to this register, the hart will block until the shift register is ready to accept the latter byte. Thus, it is not necessary for software to explicitly poll the SIA to see if it's ready before sending another byte, as the following example illustrates.

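        ; A0 -> buffer, A1 = length of buffer (must be nonzero), A2 -> SIA base

send_buffer:
        lb    t0,0(a0)          ; fetch the next byte from the buffer
        sb    t0,0(a2)          ; write TXOUT; the hart blocks while the SIA is busy
        addi  a0,a0,1
        addi  a1,a1,-1
        bne   a1,x0,send_buffer
        jalr  x0,0(ra)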

The STAT.TXR bit will be clear while the shift register is busy; otherwise, it will be set.


RXST exists in the software emulator E2, and has the following definition.

    +1   .......1   R  RXV  RXINP holds valid data

The STAT register replaces RXST in the hardware implementation of the SIA, and extends it with status information for both the receiver and the transmitter.

    +1   .......1   R  RXV  RXINP holds valid data
         ......1.   R  RXO  RXINP overrun
         .....1..   R  TXR  TXOUT ready for another byte
         ....1...   R  RXI  Receiver is currently idle
         ...1....   R  RXF  Receiver frame error detected
         ..0.....   R  ---
         .1......   R  RX0  Start bit (should always be 0)
         1.......   R  RX9  Stop bit (should always be 1)

See RXINP section below for an example of how to use this register to poll for new data.

All other bits are undefined, and are read back as 0.
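
As a cross-check of the bit assignments above, here is a minimal Python sketch that splits a STAT value into named flags. The masks are transcribed from the table; the function name `decode_stat` is hypothetical.

```python
# Bit masks transcribed from the STAT table; bit 5 is reserved and omitted.
STAT_BITS = {
    "RXV": 0x01,  # RXINP holds valid data
    "RXO": 0x02,  # RXINP overrun
    "TXR": 0x04,  # TXOUT ready for another byte
    "RXI": 0x08,  # receiver is currently idle
    "RXF": 0x10,  # receiver frame error detected
    "RX0": 0x40,  # start bit (should always be 0)
    "RX9": 0x80,  # stop bit (should always be 1)
}


def decode_stat(value):
    """Return a dict mapping each STAT flag name to True or False."""
    return {name: bool(value & mask) for name, mask in STAT_BITS.items()}
```

A value of 0x8D, for example, decodes as RX9, RXI, TXR and RXV set, with everything else clear.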


    +2   xxxxxxxx   R   Receiver Data Register (RXINP)

This register holds the most recently received byte.

If another byte is being received, this register's contents will remain stable until the stop-bit of the new byte is received.

When a hart reads this register, the valid data and overrun flags will clear automatically.

If a byte is received while the valid data flag is already set (that is, a new byte has arrived before the hart could read the old one), the overrun flag will be set.
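
The interaction between RXINP and the two flags can be summarized with a tiny behavioural model. This is only my reading of the rules above expressed in Python, not a description of the actual circuit; the class and method names are invented for the sketch.

```python
class RxModel:
    """Behavioural sketch of RXINP plus the RXV and RXO flags."""

    def __init__(self):
        self.rxinp = 0
        self.valid = False     # STAT.RXV
        self.overrun = False   # STAT.RXO

    def byte_received(self, byte):
        """Model the moment the stop bit of a new byte arrives."""
        if self.valid:
            # The previous byte was never read, so flag the overrun.
            self.overrun = True
        self.rxinp = byte & 0xFF   # RXINP holds the most recently received byte
        self.valid = True

    def read_rxinp(self):
        """Model a hart reading RXINP: returns the byte and clears both flags."""
        value = self.rxinp
        self.valid = False
        self.overrun = False
        return value
```

Receiving two bytes back to back without a read in between leaves RXO set and the second byte in RXINP; the next read clears both flags.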

The following code illustrates how to read a byte of data from the SIA:

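        ; On entry: A0 -> SIA base
        ; On exit: A0 = byte retrieved.
        ; Destroyed: T0

read_byte:
        lbu   t0,1(a0)          ; read STAT
        andi  t0,t0,1           ; isolate RXV
        beq   t0,x0,read_byte   ; spin until RXINP holds valid data
        lbu   a0,2(a0)          ; read RXINP (zero-extended); clears RXV and RXO
        jalr  x0,0(ra)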


The INTENA register contains a number of bits, each of which enables one reason for interrupt.

    +3   .......1   R  EV  Interrupt when RXINP holds valid data
         ......1.   R  EO  Interrupt when RXINP is overrun
         .....1..   R  ER  Interrupt when TXOUT is ready for another byte
         ....1...   R  EI  Interrupt when the receiver falls idle
         ...1....   R  EF  Interrupt if a frame error is detected
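
For software that wants to build an INTENA value from symbolic names, a Python sketch might look like the following. The masks are transcribed from the table above; the helper name `intena_mask` is hypothetical.

```python
# Bit masks transcribed from the INTENA table.
INTENA_BITS = {
    "EV": 0x01,  # RXINP holds valid data
    "EO": 0x02,  # RXINP is overrun
    "ER": 0x04,  # TXOUT is ready for another byte
    "EI": 0x08,  # receiver falls idle
    "EF": 0x10,  # frame error detected
}


def intena_mask(*names):
    """OR together the INTENA bits for the given flag names."""
    mask = 0
    for name in names:
        mask |= INTENA_BITS[name]
    return mask
```

For example, `intena_mask("EV", "EF")` selects the valid-data and frame-error interrupts only.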


    +4   ........ ....xxxx xxxxxxxx xxxxxxxx   RW  Baud Rate Divisor
         ..x..... ........ ........ ........   RW  RXD Loopback Control
         xx...... ........ ........ ........   RW  TXD Loopback Control

This register holds the divisor for baud rate generation and some control bits that affect the operation of the TXD and RXD pins. These control bits should be written as 0 for normal operation.


The data rate which the SIA uses to communicate with a remote device is calculated according to this formula:

                           100 000 000 Hz
    data_rate (bits/sec) = --------------
                              BAUD + 1

More useful to software, perhaps, is to solve for BAUD given a preferred data rate:

                  100 000 000 Hz
    BAUD = floor( --------------- ) - 1
                  data_rate (bps)

For example, to communicate with another device at 9600 bps, you'd set BAUD to 10415.

Because the divisor is only 20 bits wide, the slowest data rate is approximately 95 bits per second. Setting the divisor to zero will yield a 100Mbps transmission speed. The SIA, however, will not be able to receive at this speed. The fastest practical, bidirectional speed will come to around 12.5Mbps or, under more tightly controlled conditions, 25Mbps.
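
The divisor arithmetic is easy to check in a few lines of Python. This sketch assumes the 100 MHz reference clock from the formula and a 20-bit divisor field; the function names are mine, and the range check is an addition for illustration rather than something the SIA does.

```python
SIA_CLOCK_HZ = 100_000_000   # reference clock from the formula
DIVISOR_BITS = 20            # width of the divisor field in BAUD


def baud_divisor(data_rate):
    """Return the BAUD divisor for a desired data rate in bits per second."""
    divisor = SIA_CLOCK_HZ // data_rate - 1
    if not 0 <= divisor < (1 << DIVISOR_BITS):
        raise ValueError("data rate not reachable with a 20-bit divisor")
    return divisor


def actual_rate(divisor):
    """Return the data rate (bits per second) that a given divisor produces."""
    return SIA_CLOCK_HZ / (divisor + 1)
```

`baud_divisor(9600)` gives 10415, which `actual_rate` turns back into roughly 9600.6 bps; the largest divisor, 0xFFFFF, yields roughly 95 bps.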

Loopback Controls

The top two bits instruct the SIA how to handle the TXD output pin, according to the following table:

    BAUD[31]  BAUD[30]  Meaning
    0         0         The serial output shift register drives the TXD pin directly. This is the normal behavior for a UART.
    0         1         The TXD output is tied to the RXD input, creating a remote loopback configuration.
    1         0         The TXD output is forced to 0. This is typically used to transmit a break signal.
    1         1         The TXD output is forced to 1.

When configured for remote loopback, any data received from a remote device is echoed, bit for bit, back to that device. Any data sent by the local device is ignored and never makes it to the output. This is very helpful for testing a serial connection at the remote end.

NOTE: Unless configured otherwise, the data received from the remote device still makes it to the receiver! This allows, for instance, the remote device to instruct the local device when to switch back to normal operation.

BAUD[29] controls the local loopback configuration. If set to 0, the RXD input is tied to the receiver's input shift register. This is the normal behavior for a UART.

If BAUD[29] is 1, however, the input shift register is tied to the transmitter's shift register output. Any signals appearing on the RXD input pin are ignored by the SIA (except for remote loopback configuration; see above).

Local loopback mode allows the software of the local machine to test if the SIA exists and works, and should not be used for any other purpose.
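
Putting the fields together, a full 32-bit BAUD value could be assembled as in the Python sketch below. The bit positions follow the register layout (divisor in bits 19..0, local loopback in bit 29, TXD control in bits 31..30); the constant and function names are hypothetical.

```python
# TXD control values for BAUD[31:30], per the table above.
TXD_NORMAL = 0b00            # shift register drives TXD
TXD_REMOTE_LOOPBACK = 0b01   # TXD tied to RXD
TXD_FORCE_0 = 0b10           # TXD held at 0 (break)
TXD_FORCE_1 = 0b11           # TXD held at 1


def pack_baud(divisor, txd_mode=TXD_NORMAL, local_loopback=False):
    """Combine divisor and loopback controls into a 32-bit BAUD value."""
    if not 0 <= divisor < (1 << 20):
        raise ValueError("divisor must fit in 20 bits")
    if not 0 <= txd_mode <= 3:
        raise ValueError("txd_mode must fit in 2 bits")
    return (txd_mode << 30) | (int(local_loopback) << 29) | divisor
```

With the defaults, `pack_baud(10415)` is just the divisor, which matches the advice to write the control bits as 0 for normal operation.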

NOTE: The SIA does not currently implement a receiver FIFO. A future version of the SIA may support a FIFO on the receiving path. With clever use of loopback mode, you can measure the depth of the FIFO as well.