Free Hero Mesh

codepage.doc at [e4569f54cc]
Login
This is a mirror of the main repository for Free Hero Mesh. New tickets and changes will not be accepted at this mirror.

File codepage.doc artifact 9ad890221b part of check-in e4569f54cc



=== Code page numbers ===

Code page numbers are assigned according to the following:

* Code page numbers are 23-bit numbers. Numbers 0, 65534, and 65535 are
not used; they are reserved for internal use of the software.

* Mainly, IBM code page numbers are used. (Note, these are not all the
same as Microsoft code page numbers.)

(Even if multibyte code pages are implemented in future (as well as the
possibility of different font sizes than 8x8, and for narrow/wide
characters), Shift-JIS is not suitable for Free Hero Mesh, because Free
Hero Mesh uses a backslash for string escaping and a multibyte character
in Shift-JIS can contain the ASCII code for a backslash. However, EUC-JP
may be suitable.)


=== List of code page numbers ===

In the below list, "=" means that it is already included in the provided
codepage.har file, and "-" means it isn't included (it might or might not
be included in future).

   367 = ASCII only
   437 = default PC character set
   850 = PC-like Western Latin
   852 - PC-like Central European
   858 - PC-like Western Latin including Euro currency
   860 - PC-like Portuguese
   861 - PC-like Icelandic
   865 = PC-like Nordic
   866 = DOS Cyrillic Russian
  1041 - Katakana (Japanese)
  1125 = DOS Cyrillic Ukrainian/Belarusian
  1250 - Windows Central Europe
  1251 - Windows Cyrillic
  1252 - Windows Western
  1254 - Windows Turkish

(Note: Code page 437 is hard-coded and is compiled into the executable
file, and if you specify code page 367 then it will automatically change
it to 437 (since code page 437 is a superset of code page 367).)


=== File format ===

The codepage.har file is a Hamster archive containing one lump for each
code page, named by the code page number in decimal. Code page 437 need
not be included in this file; pcfont.h is used instead.

The data of the lump can be one of three formats:

* Full set: Consists of 2048 bytes.

* Half set: Consists of 1024 bytes, with only the high half of the
character set. The low half will be the same as the built-in font.

* Compressed set: Any length less than 2048 bytes except 1024 bytes. See
below section for the description of this format.

In all cases, character code 0 is not used; if it contains anything, it
will be erased after it is loaded.


=== Compressed set format ===

The first 32 bytes indicate which characters differ from the definitions
in pcfont.h. The first byte is for character codes 0-7, where the low bit
means character code 0.

For each character whose bit is set in the header, the first byte tells
which scanlines to be copied from the base character, the second byte
tells the base character to use, and the rest of the bytes are one per
scanline that is changed, with the data to be replaced with.

The second byte is omitted if no scanlines are copied from the base
character (the first byte is zero). If it is less than the current
character code, it is copied from the current code page; if equal or
greater than the current character code, it is copied from the default
code page (pcfont.h).