/* doc.h Copyright (C) 1991-4 Codemist Ltd */ /* Signature: 440bda8c 08-Dec-1995 */ Although this file is stored with suffix ".h" it is not intended to be compiled. It is just documentation about CSL and its history. ------------------------------------------------------------------------------- April 95: Introduce vectors containing binary. mkvect8/mkvect16/mkvect32/fmkvect32/fmkvect64 to hold 8, 16 and 32 bit (unboxed) integers and 32/64 bit (unboxed) floats. Access via putv8/getv8 putv16/getv16 putv32/getv32 fputv32/fgetv32 fputv64/fgetv64. NOT integrated with other (generic) vectore handling functions. double-execute() and undouble-execute() are like trace() and untrace() but cause the nominated functions to double their costs. Only at first actually does anything for functions with fixed nuymbers of args. November 94: Introduction of "mixed" vectors. October/November 94: Adjustment of various source files to support Watcom C 10.0 as well as Zortech, GCC and Microsoft foir use on 80x86. May 30 1994: In COMMON Lisp mode the compiler now supports FLET, LABELS and PROGV. Interpreted PROGV mended. May 20 1994: Bug fixed in preserve.c that meant that on 16 bit machines compressing a heap image could loop if a suitable segment boundary in the file fell on a 64Kbyte boundary! Some attempt to make bytestream_interpret more compact by moving some fragments of it out of line. May 18 1994: PRESERVE now (really) has optional args. First specifies a restart function to be called, the second is a banner to display when the image is reloaded. Reduce now uses this mechanism to gets its banner displayed. Lexical closures supported in the bytecode compiler (for variable reference but not GO or RETURN-FROM) and 3 new opcodes (closure, loadlex, storelex) introduced in bytes1.c to support same. May 16 1994: By default the MSB of pointers is now determined at run time (rather than being a manifest value compiled into the code). Exception conditions changed to set the 1 bit of nil rather than the 0x80000000 bit. More macros used to test and set exception conditions. 286 version uses software ticks. Seems as if real asynchronous ticks are needed most on just those machines least liable to support them reliably. Output column control used to be muddled between stdout (eg trace) output and regfular user printing. Now improved. Better build procedures for patches and updated help files for REDUCE, and initial banner for Reduce displayed earlier. May 1 1994: Support for calls to functions with large numbers of arguments is improved. In particular in Common Lisp mode it is possible to have more than 50 args. Line-length overflow checking for printing of big numbers implemented. binary printing for numbers, and case fold up as well as down, supported. Demonstration version of Reduce brought back into step with rest of system. car and filesign utilities scan files in alphabetic order under DOS and NT. 'demo put in lispsytem!* list if help function available. build files for Reduce include help system and patches, and can build demonstration system correctly. ------------------------------------------------------------------------------- DATA REPRESENTATION =================== All objects in CSL are represented by 32-bit words, and each object in memory is aligned at an 8-byte boundary. The bottom 3 bits of a word aare used as tags, and indicate how the rest of the word should be interpreted. In some cases the rest of the word is just direct information (eg small integers are handled this way) and then the 0x8-bit is used by the garbage collecter. In other cases the 32-bit word contains the address in memory of some object, and in that case its most significant bit is reserved for use by the garbage collector. Thus all valid addresses must have the same sign (but this can be either + or -). Many objects in memory have a header word at their start that indicates their length and gives more details about their type. The encoding of object lengths means that the largest possible single object in CSL would be 4 Mbytes long. The coding in the low bits of a word is as follows: x x x | x 0 0 0 cons cell (and NIL in Common Lisp mode) x x x | G 0 0 1 28-bit integer (G = garbage collector bit) x x x | G 0 1 0 characters, vec-hdrs, other oddities (immediate) x x x | G 0 1 1 28-bit short float if Common Lisp mode x x x | x 1 0 0 a symbol x x x | x 1 0 1 bignum. {rational number, complex number} x x x | x 1 1 0 Lisp vector (includes strings etc) x x x | x 1 1 1 pointer to boxed floating point number Pointers reserve the 0x80000000L bit for the garbage collector, assuming that all addresses are in the same half of the space. This of course is a slightly dodgy assumption that may be false on some computers. Header words in the vector heap If the bottom 3 bits of a word are 010 (TAG_ODDS) the word is treated as immediate data. In that case the 1000 bit (GC_BIT_I) is kept free for use as a mark bit. The next bits up classify the object: | x x 0 0 | * 0 1 0 valid immediate date in heap but can also be used as vector headers | x x 0 1 | * 0 1 0 symbol headers | x x 1 0 | * 0 1 0 number headers | x x 1 1 | * 0 1 0 vector headers Data that could be in the heap or on the stack decodes further: x x | 0 0 0 0 | * 0 1 0 character x x | 0 1 0 0 | * 0 1 0 handle/offset to BPS page x x | 1 0 0 0 | * 0 1 0 [handle/offset to literal vector] x x | 1 1 0 0 | * 0 1 0 special marker (SPID) left on stack to help interpreter The remaining 24 bits of the word are available for data. The BPS and literal vector handles in this case are for a segmented memory implementation, and the 24 bit address has its top 10 bits identifying a page number, then 14 bits giving a WORD offset into the associated page, which can thus be up to 64 Kbytes large. Or more depending on PAGE_BITS. Literal vectors done this way are not used at present. For Standard Lisp only 8 bits of character information is needed - the remaining bits are there to support font etc information in a Common Lisp world. . . . . | g f 0 1 | * 0 1 0 symbol header g global variable f fluid (special) variable The remaining bits in the word are used as follows 00000100 symbol names a special form 00000200 symbol has a definition as a macro 00000400 an unprinted gensym (so print-name is not complete yet) 00000800 has a definition in C from the C-coded kernel 00001000 just carries a code-pointer (codep test) 00002000 any gensym, printed or not 000fc000 fastget code in this is used as a p-list indicator 00100000 traced 00200000 is external in its home package ffc00000 reachability in first ten packages The remaining header words hold the length (in bytes) of the object (including the length of the header word) in the upper 22 bits of the word. This puts a limit of 4 Mbytes on the largest possible single object in this Lisp. 0 0 | 0 0 1 0 | * 0 1 0 bignum header word 0 0 | 0 1 1 0 | * 0 1 0 ratnum 0 0 | 1 0 1 0 | * 0 1 0 complex number 0 0 | 1 1 1 0 | * 0 1 0 NOT USED (non-float number) 0 1 | 0 0 1 0 | * 0 1 0 single float 0 1 | 0 1 1 0 | * 0 1 0 double float 0 1 | 1 0 1 0 | * 0 1 0 long float 0 1 | 1 1 1 0 | * 0 1 0 NOT USED (floating number) Note that headers for numbers are 0x|xx10|x010 and the 0x100 bit flags floating point cases n n | n 0 1 1 | * 0 1 0 bitvector with nnn bits in last byte 0 0 | 0 1 1 1 | * 0 1 0 string 0 0 | 1 0 1 1 | * 0 1 0 (bitvector) 0 0 | 1 1 1 1 | * 0 1 0 simple vector 0 1 | 0 0 1 1 | * 0 1 0 (bitvector) 0 1 | 0 1 1 1 | * 0 1 0 BPS 0 1 | 1 0 1 1 | * 0 1 0 (bitvector) 0 1 | 1 1 1 1 | * 0 1 0 hash table 1 0 | 0 0 1 1 | * 0 1 0 (bitvector) 1 0 | 0 1 1 1 | * 0 1 0 SPARE contains binary data 1 0 | 1 0 1 1 | * 0 1 0 (bitvector) 1 0 | 1 1 1 1 | * 0 1 0 Common Lisp array (header block) 1 1 | 0 0 1 1 | * 0 1 0 (bitvector) 1 1 | 0 1 1 1 | * 0 1 0 encapsulated stack pointer 1 1 | 1 0 1 1 | * 0 1 0 (bitvector) 1 1 | 1 1 1 1 | * 0 1 0 Common Lisp structure (use BPS for vector of 8-bit ints) 1 0 | 0 0 1 0 | * 0 1 0 vector of 16-bit integers 1 0 | 0 1 1 0 | * 0 1 0 vector of 32-bit integers 1 0 | 1 0 1 0 | * 0 1 0 MIXED1 [3 words of pointers, then binary] 1 0 | 1 1 1 0 | * 0 1 0 MIXED2 1 1 | 0 0 1 0 | * 0 1 0 vector of single-precision floats 1 1 | 0 1 1 0 | * 0 1 0 vector of double-precision floats 1 1 | 1 0 1 0 | * 0 1 0 MIXED3 1 1 | 1 1 1 0 | * 0 1 0 Stream handle (aka MIXED4) ------------------------------------------------------------------------------- /* end of doc.h */