reduce-historical: conservative.txt at [d58ccc1261]

File r38/lisp/csl/cslbase/conservative.txt artifact 724f80a036 part of check-in d58ccc1261

                     Conservative Garbage Collection
                     ===============================


This note is to help me while I design and implement one, and comments on
changes from a previous world.

(a) I can do an existing-style garbage collection during PRESERVE since
at that stage there ought not to be any stack to mark. Thus I can guarantee
that saved heaps will be fully compacted. Well in such cases I need just
a bit more - I should ensure that the new space that I copy to does not
contain any items that had been relocatable during the previous GC. If I am
tight on space and this is not easy then I may need to do two garbage
collections to guarantee that I end up with a fully compacted heap to save.

(b) For conservative GC I should abandon the old mark-slide compacting
collector and only ever use stop-and-copy. The effect is that if you only
have tightly limited memory space you do not get as far. This could be held
to be bad for people who run on a 32-bit machine and want over 1 Gbyte of
active heap data. I propose to take the line that they need to use the
new conservative system on a 64-bit system. But I should keep the existing
world supported for some while as well (as a seperate executable, but
probably able to use the same image files) so that they are not utterly
messed about.

(c) The only ambiguous bases I need to fuss about are on the C stack. All
other refs can be taken as fully valid.

(d) The new GC strategy will be:
    (d.1) For each ambiguous base, look at content. If it does not refer to
          heap data that I had previously allocated ignore it. Otherwise
          tag that data as immovable. [for CONS cells, symbol-heads and
          I think that a mark bit in the second word might serve? For vectors
          I can mark the header. I need to be careful about ambiguity against
          states that arise normally during GC but that are odd! I can
          identify CONS and symbol validity by a reference into a typed page
          with a good offset and sane content if I full unused parts of
          pages with a value not legal in used data (eg some sort of SPID?).
          For lisp vectors I hope that the header word is a valid test for
          a good reference (again unused space in the page must not look
          like header data so needs clearing). For binary and mixed vectors
          I may need a per-page bitmap.] Set a flag for any page that has
          immovable data on it.
          Perhaps putting symbols and all sorts of vectors in a page with
          a bitmap to identify headers will be easier! And if I have one bit
          in the map per 32 bits in the page then this will give me the
          same scheme for 32 and 64-bit machines.  Hmmm simplification!
    (d.2) Note that I must not start copying ANYTHING until I have identified
          everything that must be treated as immovable. Items tagged immovable
          must never move! Run the normal copying GC except that when an
          immovable item is reached it is just left in place. At some stage
          re-scan the ambiguous bases and copy items that are referred to from
          immovable items. Note that this means that the identification of
          valid ambigious pointers has to happen during copying as well as
          before, and this may add further limits to how tag bits must be
          used. Any immovable item must have its contents copied exactly
          once, so I need to be able to mark as "fixed" and also as "fixed
          but processed".
    (d.3) Any page that was NOT flagged as holding immovable data can be
          put in a general free pool at once. It probably does not need
          cleaning (but when a page is taken from a pool ready for use it
          probably does). Pages that contain immovable stuff haveto remain the
          same type. They have to have a linear scan to clean the parts no
          longer in use while preserving the valid stuff, and to set up a
          free-map for subsequent allocation. Shuffle things so that
          after the GC allocation of that type of data happens first from
          dirty pages.
    (d.4) Trigger the next GC while there are still enough free pages
          (including dirty ones, that should be accounted at a discount rate)
          that one can be confident of being able to complete the copy.
          In all realistic cases in fact after GC only a small proportion of
          potential heap is in use, and so usually the "half" as in half-space
          is way to pessimistic! Observe that there can be unrelocatable
          items pre-allocated in otherwise free pages.
    (d.5) Would it be worth setting up a has table of ambiguous pointers
          early in the GC to avoid need for any costly analysis to be
          repeated on identical values? If so how big would that table need
          to be and how could one cope with overflow?
    (d.6) A dirty page has a (few?) immovable items in it. Treat it as
          a sequence of minipages each of which is a consecutive free
          block of memory, and expect these to be large. Then allocation of
          new stuff can be sequential but can stop at the end of the
          minipage and skip over the unrelocatable item then. My guess is that
          the effect will be low extra costs compared to the current scheme.

Tagging during the GC

m..z000    reference to a cons cell
x..m001    a fixnum
x..m010    "odds". Bottom 8 bits indicates exact type, eg header
x..m011    short float
m..z100    symbol
m..z101    "numbers" eg pointer to bignum or complex. 1..x1xxx bits as usual
m..z110    vector of some sort
m..z111    regular boxed float

m is the GC bit used to mark this cell. z must be zero on a 64-bit machine
but may be 1 on a 32-bit one. But for symbols, numbers, vectors and boxfloats
I can use a bitmap to decide if I am pointing at a valid header.
For cons cells I will need to check that the CAR field does not contain
a value that flags the item as not in use (some sort of SPID I expect).

During the GC objects themselves get marked. The situation will be different
for each sort of object.  I will use the notation x for a cell that is
not tagged as above, and X for one that is.

Cons cells: Before GC a cons cell is xx (obviously!). When the cell has been
copied it is left as Xx with the X a forwarding address, which is necessarily
a reference to a cons. The cell is initially copied into the new space
unaltered as a fresh xx. Cons cells are allocated downwards from the top of
the new heap page. When a page is finally filled a tag SPID_GCMARK is
placed at its end. Newly copied stuff is scanned so that sub-structure gets
copied in due course, but that does not involve any extra tag or mark bits.
Ambigouos refs to cons cells can be handled as follows. An initial pass
of the ambiguous bases sets each cons that it finds to xX. Once all ambiguous
bases have been pre-tagged a second scan of them can call copy on the two
fields, leaving them tagged xX. At the end of GC a final scan resets
unrelocated items back to xx. Wheneverthe copy process finds a cons it of
course checks to see if it is Xx and so has been copied already - but in this
new case it also needs to check for xX to see if it shoudd be left unmoved.
Note - I could use XX not xX to tag an immovable cons, and just what I do
can be decided on the basis of looking to see which branch of the code would
be the cheaper one to put the test for ?X into.

Odds/BPS: one special case off otherwise immediate data needs attention. A
reference to (bytecoded) binary code is packed in a curious way. However its
treatment will be parallel to that for strings and bignums. An oddity here is
that the packing used for BPS references puts a limit on the amount of
bytecoded stuff that can be addressed, but I take the view that the bytecode
representation is so compact that that would never be an awkward limit in
any even vaguely realistic situation. And if it was I could re-work it for
64-bit machines to in effect remove the limit.

Fixnum, odds, short-float: the GC never changes these anyway and so for the
purposes of ambiguous bases they just get treated as "not a pointer" and hence
"nothing to worry about". Note however the BPS issue as above.

Vectors that only contain binary data: this includes strings and bignums
and boxed floats, and so it is actually a quite common case. And observe that
bytecoded things fall in this category too. Between garbage collections a
vector has a header word at its start. The pre-pass for ambiguous pointers can
just set the mark bit in the header word. Ie the header will be of the
for x...1010. Normally when the copy phase sees a vector it will write
a reference to the relocated vector into that header word. It writes it
without any special tag bits, and the presense of a non-header identifies
the object as having been visited. A past-pass on ambiguous pointers must
reset the marked headers.


Symbols, vectors that contain lists, mixed vectors: these actually seem to
be handled just the same way as vectors containing binary, except that there
must be a scan of ambiguous pointers that copies all vector content.

So in top-level outline there are 3 scans of ambiguous bases:
(1) tag addresses as immovable;
(2) copy contents of whatever the immovable thing was;
<at this stage one then copies startin from all the safe list bases>
(3) untag immovable data.
and I think that this means that I need to have unambiguous lists of
ambiguous pointers(!). If I just walk down the stack I might be twitchy
that the walk could cover some part of the stack that was in use by the
GC itself and so could change as I went. To avoid that as well as to
(slightly) speed things up I will walk the stack from the top down, and to
start with I will test if a value seems like a pointer and if so stick it in
a fixed size array. If I am lucky this fixed array will end up holding all
values that I need to trace! And I can use if in each of (1), (2) and (3).
If I am unlucky my stack will be very deep and many of the values on it
will appear to be Lisp values, and so I will completely fill the fixed
table. If I do I can save the pointer into the stack where the first value
beyond that that seemed pointerish lived, and do a the pointer validation
on each item from there to the stack base during each of my 3 phases.
Observation of how Reduce runs suggests that if my table were to hold
say 4000 entries it would almost always suffice, and 2000 would do nearly as
well. It is possible that a table of size 2000 set up as a hash table so
that duplicate values only get stored once would be a yet better idea.