File psl-1983/3-1/lpt/22-parser.lpt artifact 5482c246b1 part of check-in e1a8550313


PSL Manual                    7 February 1983                  Parser Tools
section 22.0                                                      page 22.1

                                CHAPTER 22
                                CHAPTER 22
                                CHAPTER 22
                               PARSER TOOLS
                               PARSER TOOLS
                               PARSER TOOLS




     22.1. Introduction .  .  .  .  .  .  .  .  .  .  .  .  .  .  .    22.1
     22.2. The Table Driven Parser  .  .  .  .  .  .  .  .  .  .  .    22.2
          22.2.1. Flow Diagram for the Parser.  .  .  .  .  .  .  .    22.2
          22.2.2. Associating the Infix Operator with a Function  .    22.4
          22.2.3. Precedences .  .  .  .  .  .  .  .  .  .  .  .  .    22.5
          22.2.4. Special Cases of 0 <-0 and 0 0.  .  .  .  .  .  .    22.5
          22.2.5. Parenthesized Expressions  .  .  .  .  .  .  .  .    22.5
          22.2.6. Binary Operators in General.  .  .  .  .  .  .  .    22.6
          22.2.7. Assigning Precedences to Key Words  .  .  .  .  .    22.7
          22.2.8. Error Handling .  .  .  .  .  .  .  .  .  .  .  .    22.7
          22.2.9. The Parser Program for the RLISP Language .  .  .    22.7
          22.2.10. Defining Operators  .  .  .  .  .  .  .  .  .  .    22.8
     22.3. The MINI Translator Writing System.  .  .  .  .  .  .  .   22.10
          22.3.1. A Brief Guide to MINI.  .  .  .  .  .  .  .  .  .   22.10
          22.3.2. Pattern Matching Rules  .  .  .  .  .  .  .  .  .   22.12
          22.3.3. A Small Example.  .  .  .  .  .  .  .  .  .  .  .   22.12
          22.3.4. Loading Mini.  .  .  .  .  .  .  .  .  .  .  .  .   22.13
          22.3.5. Running Mini.  .  .  .  .  .  .  .  .  .  .  .  .   22.13
          22.3.6. MINI Error messages and Error Recovery .  .  .  .   22.13
          22.3.7. MINI Self-Definition .  .  .  .  .  .  .  .  .  .   22.13
          22.3.8. The Construction of MINI.  .  .  .  .  .  .  .  .   22.15
          22.3.9. History of MINI Development.  .  .  .  .  .  .  .   22.16
     22.4. BNF Description of RLISP Using MINI  .  .  .  .  .  .  .   22.17




22.1. Introduction
22.1. Introduction
22.1. Introduction

  In   many   applications,   it   is   convenient   to  define  a  special
"problem-oriented" language, tailored to provide a  natural  input  format.
Examples  include the RLISP ALGOL-like surface language for algebraic work,
graphics languages, boolean query languages for data-base,  etc.    Another
                                                  ________
important  case  is  the  requirement  to  accept existing programs in some
language, either to translate them  to  another  language,  to  compile  to
machine  language,  to  be  able  to  adapt  existing  code  into  the  PSL
environment (e.g. mathematical libraries, etc.), or because we wish to  use
PSL  based  tools  to  analyze  a program written in another language.  One
approach is to  hand-code  a  program  in  PSL  (called  a  "parser")  that
translates  the  input  language  to  the desired form; this is tedious and
error prone, and it is more convenient to use a "parser-writing-tool".

  In this Chapter we describe in detail two important parser writing  tools
available  to the PSL programmer: an extensible table-driven parser that is
used for the RLISP parser (described in Chapter 3),  and  the  MINI  parser
generator.    The table-driven parser is most useful for languages that are
Parser Tools                  7 February 1983                    PSL Manual
page 22.2                                                      section 22.1

simple  extensions  of  RLISP,  or in fact for rapidly adding new syntactic
constructs to RLISP.  The MINI system is used for the development  of  more
complete user languages.



22.2. The Table Driven Parser
22.2. The Table Driven Parser
22.2. The Table Driven Parser

  The  parser is a top-down recursive descent parser, which uses a table of
___________
Precedences to control the parse; if numeric precedence  is  not  adequate,
LISP functions may be inserted into the table to provide more control.  The
parser  described  here  was  developed by Nordstrom [Nordstrom 73], and is
very similar to parser described by Pratt [Pratt 73], and  apparently  used
for the CGOL language, another LISP surface language.

                                                                Scan   Scan
                                                                Scan   Scan
  The parser reads tokens from an input stream using a function Scan.  Scan
            ChannelReadToken
            ChannelReadToken
calls  the  ChannelReadToken function described in Chapter 12, and performs
some additional checks, described below.  Each token is defined to  be  one
of the following:


                    non-operator          O
                    right operator        O->
                    binary operator     <-O->


  All  combinations  of . . .O-> O. . . and O <-O->. . . are supposed to be
legal, while the  combinations  . . .O-> <-O->. . .,  . . .<-O-> <-O->. . .
and  O O. . . are normally illegal (error ARG MISSING and error OP MISSING,
respectively).

                                       __
  With each operator (which must be an id)  is  associated  a  construction
function, a right precedence, and for binary operators, a left precedence.

  The  Unary  Prefix  operators  have  this  information  stored  under the
indicator  'RLISPPREFIX  and  Binary  operators  have   it   stored   under
'RLISPINFIX.    (Actually, the indicator used at any time during parsing is
the VALUE of GRAMPREFIX or GRAMINFIX, which may be changed by the user).


22.2.1. Flow Diagram for the Parser
22.2.1. Flow Diagram for the Parser
22.2.1. Flow Diagram for the Parser

  In this diagram RP stands for Right Precedence, LP  for  Left  Precedence
and  CF for Construction Function.  OP is a global variable which holds the
current token.
PSL Manual                    7 February 1983                  Parser Tools
section 22.2                                                      page 22.3


     procedure PARSE(RP);
      RDRIGHT(RP,SCAN()); % SCAN reads next token

                 RDRIGHT(RP,Y)
                       |
                      \|/
                       |
            ------------------------
            |                      |yes
            |      Y is Right OP   |-----> Y:=APPLY(Y.CF,
            |                      |                RDRIGHT(Y.RP));
            ------------------------                .
                       |                            .
                      \|/ no                        .
                       |                            .
            ------------------------                .
ERROR    yes|                      | no             .
ARG    <----|      Y is Binary OP  |----> OP:=      .
MISSING     |                      |       SCAN();  .
            ------------------------           .    .
                       |--------<------------<------*
            RDLEFT:   \|/                           ^
                       |                            ^
            ------------------------                ^
ERROR     no|                      |                ^
 OP    <----|    OP is Binary      |                ^
MISSING     |                      |                ^
            ------------------------                ^
                       |                            ^
                      \|/  yes                      ^
                       |                            ^
            ------------------------                ^
RETURN   yes|                      |no              ^
 (Y)   <----|   RP > OP.lp         |---> Y:=APPLY(OP.cf,Y,
            ------------------------       PARSE(OP.lp,SCAN());
Parser Tools                  7 February 1983                    PSL Manual
page 22.4                                                      section 22.2

  This  diagram  reflects the major behavior, though some trivial additions
are included in the RLISP case to handle cases such as OP-> <-OP, '!;, etc.
[See PU:RLISP-PARSER.RED for full details.]

  The technique involved may also be described by the following figure:


                           . . . 0-> Y <-0 . . .
                                     rp lp


  Y is a token or an already parsed expression between  two  operators  (as
indicated).    If 0->'s RP is greater than <-0's LP, then 0-> is the winner
and Y goes to 0->'s construction function (and vice  versa).    The  result
from the construction function is a "new Y" in another parse situation.

  By associating precedences and construction functions with the operators,
we are now able to parse arithmetic expressions (except for function calls)
and  a  large  number of syntactical constructions such as IF - THEN - ELSE
- ; etc.  The following discussion of how to expand the parser to  cover  a
language  such  as  RLISP  (or ALGOL) may also be seen as general tools for
handling the parser and defining construction functions and precedences.


22.2.2. Associating the Infix Operator with a Function
22.2.2. Associating the Infix Operator with a Function
22.2.2. Associating the Infix Operator with a Function

      Scan                RAtomHook
      Scan                RAtomHook         __              __
  The Scan, after calling RAtomHook, checks ids and special ids (those with
TOKTYPE!* = 3) to see if they should  be  renamed  from  external  form  to
                             Plus2
                             Plus2
internal  form  (e.g. '!+ to Plus2).  This is done by checking for a NEWNAM
                              __               __
or NEWNAM!-OP property on the id.  For special ids, the NEWNAM!-OP property
is first checked.  The value of the property is a replacement token, i.e.  


PUT('!+,'NEWNAM!-OP,'PLUS2)


has been done.

  Scan                                  RlispRead
  Scan                                  RlispRead
  Scan also handles the ' mark, calling RlispRead to get the  S-expression.
RlispRead                           Read
RlispRead                           Read
RlispRead   is   a   version   of   Read,   using   a   special  SCANTABLE,
RLISPREADSCANTABLE!*.

               Scan
               Scan
  The function Scan also sets SEMIC!* to '!; or '!$ if CURSYM!* is detected
to be '!*SEMICOL!* (the internal name for '!; and "!$).  This controls  the
RLISP  echo/no-echo  capability.  Finally, if the renamed token is 'COMMENT
                    ReadCh
                    ReadCh
then characters are ReadCh'd until a '!; or '!$ .
PSL Manual                    7 February 1983                  Parser Tools
section 22.2                                                      page 22.5

22.2.3. Precedences
22.2.3. Precedences
22.2.3. Precedences

  To  set up precedences, it is often helpful to set up a precedence matrix
of the operators involved.  If  any  operator  has  one  "precedence"  with
respect to one particular operator and another "precedence" with respect to
some  other,  it  is  sometimes  not  possible  to run the parser with just
numbered precedences for the operators without introducing ambiguities.  If
this is the case, replace the number RP by the operator RP  and  test  with
something like:


                         IF RP *GREATER* OP . . .


*GREATER*  may  check in the precedence matrix.  An example in which such a
scheme might be used is the case for which ALGOL uses ":"  both as a  label
marker  and  as  an index separator (although in this case there is no need
for the change above).  It is also a good policy to have even  numbers  for
right precedences and odd numbers for left precedences (or vice versa).


22.2.4. Special Cases of 0 <-0 and 0 0
22.2.4. Special Cases of 0 <-0 and 0 0
22.2.4. Special Cases of 0 <-0 and 0 0

  If  . . .0 0. . .  is  a  legal  case  (i.e. F A may translate to (F A)),
ERROR OP MISSING is replaced by:


                Y:=REPCOM(Y,RDRIGHT(99,OP)); GO TO RDLEFT;


The value 99 is chosen in order to have the first object (F)  behave  as  a
right  operator  with  maximum precedence.  If . . .0 <-0. . . is legal for
some combinations of operators, replace  ERROR  ARG  MISSING  by  something
equivalent to the illegal RLISP statement:


IF ISOPOP(OP,RP,Y)
         THEN <<OP:=Y;
                Y:=(something else, i.e. NIL);
                GOTO RDLEFT>>
       ELSE ERROR ARG MISSING;


ISOPOP is supposed to return T if the present situation is legal.


22.2.5. Parenthesized Expressions
22.2.5. Parenthesized Expressions
22.2.5. Parenthesized Expressions


                       (a) is to be translated to a.

                                   E.g.
Parser Tools                  7 February 1983                    PSL Manual
page 22.6                                                      section 22.2

                    BEGIN a END translates to (PROG a).


  Define  "("  and  BEGIN as right operators with low precedences (2 and -2
respectively).  Also define ")" and END as binary operators  with  matching
left  precedences  (1 and -3 respectively).  The construction functions for
"(" and BEGIN are then something like:  [See pu:RLISP-PARSER.RED for  exact
details on ParseBEGIN]


BEGIN     (X);PROG2(OP:=SCAN();MAKEPROG(X));
"("       (X);PROG2(IF OP=') THEN OP:=SCAN()
                                  ELSE ERROR, x);


  Note that the construction functions in these cases have to read the next
token;  that  is the effect of ")" closing the last "(" and not all earlier
"("'s.  This is also an example of binary operators declared only  for  the
purpose of having a left precedence.


22.2.6. Binary Operators in General
22.2.6. Binary Operators in General
22.2.6. Binary Operators in General

  As almost all binary operators have a construction function like


                               LIST(OP,X,Y);


it  is  assumed to be of that kind if no other is given.  If OP is a binary
operator, then "a OP b OP c" is interpreted as "(a OP b) OP c" only if OP's
LP is less than OP's RP.

  Example:


                    A + B + C translates to (A + B) + C
                          because +'RP = 20 and +'LP = 19

                    A ^ B ^ C translates to A ^ (B ^ C)
                          because ^'RP = 20 and ^'LP = 21


  If you want some operators to translate to n-ary expressions, you have to
define a proper construction function for that operator.

  Example:  


PLUS   (X,Y); IF CAR(X) = 'PLUS THEN NCONC(X,LIST(Y))
                              ELSE LIST('PLUS,X,Y);
PSL Manual                    7 February 1983                  Parser Tools
section 22.2                                                      page 22.7

  By  defining  ","  and  ";"  as  ordinary  binary  operators,  the parser
automatically takes care  of  constructions  like  . . .e,e,e,e,e. . .  and
. . . stm;stm;stm;stm;. . .    It  is  then  up  to some other operators to
remove the "," or the ";" from the parsed result.


22.2.7. Assigning Precedences to Key Words
22.2.7. Assigning Precedences to Key Words
22.2.7. Assigning Precedences to Key Words

  If you want some operators to have control immediately, insert


                      IF RP = NIL THEN RETURN Y ELSE


as the very first test in RDRIGHT and set the right precedence of those  to
NIL.    This  is  sometimes useful for key-word expressions.  If entering a
construction function of such an operator, X is the token immediately after
the operator.  E.g.:  We want to parse PROCEDURE EQ(X,Y); .  .  .    Define
PROCEDURE  as  a  right  operator with NIL as precedence.  The construction
function for PROCEDURE can always call the parser and set the rest  of  the
expression.    Note  that if PROCEDURE was not defined as above, the parser
would misunderstand the expression in the case  of  EQ  as  declared  as  a
binary operator.


22.2.8. Error Handling
22.2.8. Error Handling
22.2.8. Error Handling

  For  the  present, if an error occurs a message is printed but no attempt
is made to correct or handle the error.  Mostly the parser goes wild for  a
while (until a left precedence less than current right precedence is found)
and then goes on as usual.


22.2.9. The Parser Program for the RLISP Language
22.2.9. The Parser Program for the RLISP Language
22.2.9. The Parser Program for the RLISP Language

  SCAN();

  The  purpose  of  this  function is to read the next token from the input
stream.  It uses the general purpose table driven token  scanner  described
in  Chapter  12,  with  a specially set up ReadTable, RLISPSCANTABLE!*.  As
                                                            Scan
                   __________                               Scan
RLISP has multiple identifiers  for  the  same  operators,  Scan  uses  the
following translation table:
                    =  EQUAL            >= GEQ
                    +  PLUS             >  GREATERP
                    -  DIFFERENCE       <= LEQ
                    /  QUOTIENT         <  LESSP
                    .  CONS             *  TIMES
                    := SETQ             ** EXPT


                     Scan
                     Scan
  In  these  cases,  Scan  returns the right hand side of the table values.
                                             Scan
                                             Scan
Also, two special cases are taken care of in Scan:
Parser Tools                  7 February 1983                    PSL Manual
page 22.8                                                      section 22.2

   a. '  is  the  QUOTE mark.  If a parenthesized expression follows '
      then the syntax within the parenthesis is that of LISP, using  a
      special  scan  table,  RLISPREADSCANTABLE!*.    The  only  major
      difference from ordinary LISP is that  !  is  required  for  all
      special characters.

   b. ! in RLISP means actually two things:


         i. the  following  symbol  is not treated as a special symbol
            (but belongs to the print name of the atom in process);

        ii. the atom created cannot be an operator.


  Example: !( in the text behaves as the atom "(".

  To signal to the parser that this is the case, the flag variable ESCAPEFL
must be set to T if this situation occurs.


22.2.10. Defining Operators
22.2.10. Defining Operators
22.2.10. Defining Operators

  To define operators use:


DEFINEROP(op,p{,stm});
          For right or prefix operators.

DEFINEBOP(op,lp,rp{,stm});
          For binary operators.


  These use the VALUE of DEFPREFIX and DEFINFIX to  store  the  precedences
and  construction  functions.    The  default  is  set  for  RLISP,  to  be
                                        __________
'RLISPPREFIX and 'RLISPINFIX.  The same identifier can be defined  both  as
the right and binary operator.  The context defines which one applies.

  Stm is the construction function.  If stm is omitted, the common defaults
are used:


LIST(OP,x)
          prefix     case,    x    is    parsed    expression    following,
          x=RDRIGHT(p,SCAN()).

LIST(OP,x,y)
          binary case, x is previously parsed expression, y  is  expression
          following, y=RDRIGHT(rp,SCAN()).


               __
  If stm is an id, it is assumed to be a procedure of one or two arguments,
PSL Manual                    7 February 1983                  Parser Tools
section 22.2                                                      page 22.9

for   "x"   or  "x,y".    If  it  is  an  expression,  it  is  embedded  as
(LAMBDA(X) stm) or (LAMBDA(X Y) stm), and should  refer  to  X  and  Y,  as
needed.

  Also  remember  that  the free variable OP holds the last token (normally
the binary operator which stopped the parser).  If  "p"  or  "rp"  is  NIL,
RDRIGHT  is  not called by default, so that only SCAN() (the next token) is
passed.


For example,

DEFINEBOP('DIFFERENCE,17,18);
        % Most common case, left associative, stm=LIST(OP,x,y);

DEFINEBOP('CONS,23,21);
        % Right Associative, default stm=LIST(OP,x,y)

DEFINEBOP('AND,11,12,ParseAND);
        % Left Associative, special function
    PROCEDURE ParseAND(X,Y);
       NARY('AND,X,Y);

DEFINEBOP('SETQ,7,6,ParseSETQ);
        % Right Associative, Special Function
    PROCEDURE ParseSETQ(LHS,RHS);
      LIST(IF ATOM LHS THEN 'SETQ ELSE 'SETF, LHS, RHS);

DEFINEROP('MINUS,26);    % default C-fn, just (list OP arg)

DEFINEROP('PLUS,26,ParsePLUS1); %

DEFINEROP('GO,NIL,ParseGO );
       % Special Function, DO NOT use default PARSE ahead
    PROCEDURE ParseGO X;   X is now JUST next-token
      IF X EQ 'TO THEN LIST('GO,PARSE0(6,T))
                % Explicit Parse ahead
           ELSE <<OP := SCAN(); % get Next Token
                  LIST('GO,X)>>;

DEFINEROP('GOTO,NIL,ParseGOTO );
        % Suppress Parse Ahead, just pass NextToken
   PROCEDURE ParseGOTO X;
     <<OP := SCAN();
       LIST('GO,X)>>;
Parser Tools                  7 February 1983                    PSL Manual
page 22.10                                                     section 22.3

22.3. The MINI Translator Writing System
22.3. The MINI Translator Writing System
22.3. The MINI Translator Writing System

  Note that MINI is now autoloading.


22.3.1. A Brief Guide to MINI
22.3.1. A Brief Guide to MINI
22.3.1. A Brief Guide to MINI

  The  following  is  a  brief introduction to MINI, the reader is referred
to [Marti 79] for a more detailed discussion of the  META/RLISP  operators,
which are very similar to those of MINI.

  The  MINI  system reads in a definition of a translator, using a BNF-like
form.  This is processed by MINI into a set of LISP functions, one for each
production, which make calls on each other, and a set of  support  routines
that  recognize  a  variety  of  simple  constructs.   MINI uses a stack to
perform parsing, and the user can access sub-trees already  on  the  stack,
replacing  them  by  other trees built from these sub-trees.  The primitive
                         __   _______
functions that recognize ids, integers, etc. each  place  their  recognized
token on this stack.

  For example,


  FOO: ID '!- ID +(PLUS2 #2 #1) ;


defines  a  rule FOO, which recognizes two identifiers separated by a minus
                                    __________
sign (each ID pushes the recognized identifier onto the stack).   The  last
expression  replaces  the top 2 elements on the stack (#2 pops the first ID
pushed onto the stack, while #1 pops the other) with a LISP statement.


 Id
 Id    _______                                                         ____
(Id ): boolean                                                         expr

                                __________
     See if current token is an identifier and not a keyword.   If  it
     is, then push onto the stack and fetch the next token.


 AnyId
 AnyId    _______                                                      ____
(AnyId ): boolean                                                      expr

                                __
     See if current token is an id whether or not it is a key word.


 AnyTok
 AnyTok    _______                                                     ____
(AnyTok ): boolean                                                     expr

     Always succeeds by pushing the current token onto the stack.


 Num
 Num    _______                                                        ____
(Num ): boolean                                                        expr

                                               ______
     Tests  to  see  if the current token is a number, if so it pushes
         ______
     the number onto the stack and fetches the next token.
PSL Manual                    7 February 1983                  Parser Tools
section 22.3                                                     page 22.11

 Str
 Str    _______                                                        ____
(Str ): boolean                                                        expr

             Num
             Num             ______
     Same as Num, except for strings.

  Specification of a parser using MINI consists of defining the syntax with
BNF-like  rules  and  semantics  with LISP expressions.  The following is a
brief list of the operators:


'         Used to designate a terminal symbol (i.e. 'WHILE, 'DO, '!=).

Identifier
          Specifies a nonterminal.

( )       Used for grouping (i.e. (FOO BAR)  requires  rule  FOO  to  parse
          followed immediately by BAR).

< >       Optional  parse,  if  it fails then continue (i.e. <FOO> tries to
          parse FOO).

/         Optional rules (i.e. FOO / BAR allows either FOO or BAR to parse,
          with FOO tested first).

STMT*     Parse any number of STMT.

STMT[ANYTOKEN]*
          Parse any number of STMT separated by ANYTOKEN, create a list and
                                                                __________
          push onto the stack (i.e. ID[,]* parses a number  of  identifiers
          separated by commas, like in an argument list).

                                                        _______
##n       Refer to the nth stack location (n must be an integer).

                                                   _______
#n        Pop the nth stack location (n must be an integer).

+(STMT)   Push the unevaluated (STMT) onto the stack.

.(SEXPR)  Evaluate the SEXPR and ignore the result.

=(SEXPR)  Evaluate the SEXPR and test if result non-NIL.

+.(SEXPR) Evaluate the SEXPR and push the result on the stack.

@ANYTOKEN Specifies  a  statement  terminator;  used  in the error recovery
          mechanism to search for the occurrence of errors.

@@ANYTOKEN
          Grammar terminator;  also  stops  scan,  but  if  encountered  in
          error-recovery, terminates grammar.
Parser Tools                  7 February 1983                    PSL Manual
page 22.12                                                     section 22.3

22.3.2. Pattern Matching Rules
22.3.2. Pattern Matching Rules
22.3.2. Pattern Matching Rules

  In addition to the BNF-like rules that define procedures with 0 arguments
and  which  scan  tokens  by calls on NEXT!-TOK() and operate on the stack,
MINI also includes a simple TREE  pattern  matcher  and  syntax  to  define
PatternProcedures that accept and return a single argument, trying a series
of patterns until one succeeds.


E.g.        template    ->  replacement

PATTERN = (PLUS2 &1 0) -> &1,
          (PLUS2 &1 &1) -> (LIST 'TIMES2 2 &1),
          &1            -> &1;


defines  a pattern with 3 rules.  &n is used to indicate a matched sub-tree
in both the template and replacement.  A repeated  &n,  as  in  the  second
               Equal
               Equal
rule, requires Equal sub-trees.


22.3.3. A Small Example
22.3.3. A Small Example
22.3.3. A Small Example


% A simple demo of MINI, to produce a LIST-NOTATION reader.
% INVOKE 'LSPLOOP reads S-expressions, separated by ;

mini 'lsploop;                  % Invoke MINI, give name of ROOT
                                % Comments can appear anywhere,
                                % prefix by % to end-of-line
lsploop:lsp* @@# ;              % @@# is GRAMMAR terminator
                                %  like '# but stops TOKEN SCAN
lsp:    sexp @;                 % @; is RULE terminator, like ';
        .(print #1)             %  but stops SCAN, to print
        .(next!-tok) ;          %  so call NEXT!-TOK() explicitly
sexp:   id / num / str / '( dotexp ') ;
dotexp: sexp* < '. sexp +.(attach #2 #1)  > ;
fin

symbolic procedure attach(x,y);
<<for each z in reverse x do y:=z . y; y>>;


22.3.4. Loading Mini
22.3.4. Loading Mini
22.3.4. Loading Mini

  MINI is loaded from PH: using LOAD MINI;.
PSL Manual                    7 February 1983                  Parser Tools
section 22.3                                                     page 22.13

22.3.5. Running Mini
22.3.5. Running Mini
22.3.5. Running Mini

                                          Invoke
                                          Invoke
  A  MINI  grammar  is  run  by  calling  Invoke  rootname;.  This installs
appropriate Key Words (stored on the property list of rootname), and  start
the grammar by calling the Rootname as first procedure.


22.3.6. MINI Error messages and Error Recovery
22.3.6. MINI Error messages and Error Recovery
22.3.6. MINI Error messages and Error Recovery

  If  MINI detects a non-fatal error, a message be printed, and the current
token and stack is shown.  MINI then  calls  NEXT!-TOK()  repeatedly  until
either a statement terminator (@ANYTOKEN) or grammar terminator (@ANYTOKEN)
is seen.  If a grammar terminator, the grammar is exited; otherwise parsing
resumes from the ROOT.

  [??? Interaction with BREAK loop rather poor at the moment ???]
  [??? Interaction with BREAK loop rather poor at the moment ???]
  [??? Interaction with BREAK loop rather poor at the moment ???]


22.3.7. MINI Self-Definition
22.3.7. MINI Self-Definition
22.3.7. MINI Self-Definition


% The following is the definition of the MINI meta system in terms of
% itself.  Some support procedures are needed, and exist in a
% separate file.
% To define a grammar, call the procedure MINI with the argument
% being the root rule name.   Then when the grammar is defined it may
% be called by using INVOKE root rule name.

%   The following is the MINI Meta self definition.

MINI 'RUL;

%   Define the diphthongs to be used in the grammar.
DIP: !#!#, !-!>, !+!., !@!@ ;

%   The root rule is called RUL.
RUL: ('DIP ': ANYTOK[,]* .(DIPBLD #1) '; /
     (ID  .(SETQ !#LABLIST!# NIL)
       ( ': ALT            +(DE #2 NIL #1) @; /
         '= PRUL[,]* @;    .(RULE!-DEFINE '(PUT(QUOTE ##2)(QUOTE RB)
                             (QUOTE #1)))
                           +(DE ##1 (A)
                             (RBMATCH A (GET (QUOTE #1) (QUOTE RB))
                                                               NIL)))
       .(RULE!-DEFINE #1) .(NEXT!-TOK) ))* @@FIN ;

%   An alternative is a sequence of statements separated by /'s;
ALT: SEQ < '/ ALT +(OR #2 #1) >;

%   A sequence is a list of items that must be matched.
SEQ: REP < SEQ +(AND #2 (FAIL!-NOT #1)) >;
Parser Tools                  7 February 1983                    PSL Manual
page 22.14                                                     section 22.3

%   A repetition may be 0 or more single items (*) or 0 or more items
%    separated by any token (ID[,]* parses a list of ID's separated
%    by ,'s.
REP: ONE
      <'[ (ID +(#1) /
           '' ANYKEY +(EQTOK!-NEXT (QUOTE #1)) /
     ANYKEY +(EQTOK!-NEXT (QUOTE #1))) '] +(AND #2 #1) '* BLD!-EXPR /
        '* BLD!-EXPR>;

%   Create an sexpression to build a repetition.
BLD!-EXPR: +(PROG (X) (SETQ X (STK!-LENGTH))
                   $1 (COND (#1 (GO $1)))
                      (BUILD!-REPEAT X)
                      (RETURN T));

ANYKEY: ANYTOK .(ADDKEY ##1) ;  % Add a new KEY

%   One defines a single item.
ONE: '' ANYKEY  +(EQTOK!-NEXT (QUOTE #1)) /
     '@ ANYKEY  .(ADDRTERM ##1)  +(EQTOK (QUOTE #1)) /
     '@@ ANYKEY .(ADDGTERM ##1)  +(EQTOK (QUOTE #1)) /
     '+ UNLBLD  +(PUSH #1) /
     '. EVLBLD  +(PROGN #1 T) /
     '= EVLBLD  /
     '< ALT '>  +(PROGN #1 T) /
     '( ALT ')  /
     '+. EVLBLD +(PUSH #1) /
     ID         +(#1) ;

%   This rule defines an un evaled list.  It builds a list with
%   everything quoted.
UNLBLD: '( UNLBLD ('. UNLBLD ') +(CONS #2 #1) /
                    UNLBLD* ') +(LIST . (#2 . #1)) /
                   ') +(LIST . #1)) /
        LBLD    /
        ID      +(QUOTE #1) ;

%   EVLBLD builds a list of evaled items.
EVLBLD: '( EVLBLD ('. EVLBLD ') +(CONS #2 #1) /
                    EVLBLD* ') +(#2 . #1) /
                   ') ) /
        LBLD /
        ID      ;

LBLD: '# NUM    +(EXTRACT #1) /
      '## NUM   +(REF #1) /
      '$ NUM    +(GENLAB #1) /
      '& NUM    +(CADR (ASSOC #1 (CAR VARLIST))) /
      NUM       /
      STR       /
      '' ('( UNLBLD* ') +(LIST . #1) /
           ANYTOK +(QUOTE #1));
PSL Manual                    7 February 1983                  Parser Tools
section 22.3                                                     page 22.15


%   Defines the pattern matching rules (PATTERN -> BODY).
PRUL: .(SETQ INDEXLIST!* NIL)
      PAT '-> (EVLBLD)*
             +(LAMBDA (VARLIST T1 T2 T3) (AND . #1))
             .(SETQ PNAM (GENSYM))
             .(RULE!-DEFINE (LIST 'PUTD (LIST 'QUOTE PNAM)
                '(QUOTE EXPR) (LIST 'QUOTE #1)))
             +.(CONS #1 PNAM);

%   Defines a pattern.
%   We now allow the . operator to be the next to last in a ().
PAT: '& ('< PSIMP[/]* '> NUM
             +.(PROGN (SETQ INDEXLIST!* (CONS ##1 INDEXLIST!*))
                  (LIST '!& #2 #1) ) /
             NUM
               +.(COND ((MEMQ ##1 INDEXLIST!*)
                         (LIST '!& '!& #1))
                  (T (PROGN (SETQ INDEXLIST!* (CONS ##1 INDEXLIST!*))
                         (LIST '!& #1)))) )
        / ID
        / '!( PAT* <'. PAT +.(APPEND #2 #1)> '!)
        / '' ANYTOK
        / STR
        / NUM ;

%   Defines the primitives in a pattern.
PSIMP: ID / NUM / '( PSIMP* ') / '' ANYTOK;

%   The grammar terminator.
FIN



22.3.8. The Construction of MINI
22.3.8. The Construction of MINI
22.3.8. The Construction of MINI

  MINI  is  actually  described  in  terms  of  a  support  package for any
MINI-generated parser and a self-description of MINI.  The useful files (on
PU: and PL:) are as follows:


MINI.MIN  The self definition of MINI in MINI.
MINI.SL   A Standard LISP version of MINI.MIN, translated by MINI itself.
MINI.RED  The support RLISP for MINI.
MINI-PATCH.RED and MINI.FIX
          Some additions being tested.
MINI.LAP  The precompiled LAP file.  Use LOAD MINI.
MINI-LAP-BUILD.CTL
          A batch file that builds PL:MINI.LAP from the above files.
MINI-SELF-BUILD.CTL
          A batch  file  that  builds  the  MINI.SL  file  by  loading  and
          translating MINI.MIN.
Parser Tools                  7 February 1983                    PSL Manual
page 22.16                                                     section 22.3

22.3.9. History of MINI Development
22.3.9. History of MINI Development
22.3.9. History of MINI Development

  The MINI Translator Writing System was developed in two steps.  The first
was the enhancement of the META/RLISP [Marti 79] system with the definition
of  pattern  matching  primitives  to  aid  in  describing  and  performing
tree-to-tree transformations.  META/RLISP is very proficient at translating
an input programming language into LISP or LISP-like  trees,  but  did  not
have  a good method for manipulating the trees nor for direct generation of
target machine code.  PMETA  (as  it  was  initially  called) [Kessler  79]
solved  these  problems  and  created  a  very  good  environment  for  the
development of compilers.  In fact, the PMETA enhancements have been  fully
integrated into META/RLISP.

  The  second step was the elimination of META/RLISP and the development of
a smaller, faster system (MINI).  Since META/RLISP was designed to  provide
maximum  flexibility  and  full generality, the parsers that is creates are
large and slow.  One of its most significant problems is that it  uses  its
own   single  character  driven  LISP  functions  for  token  scanning  and
recognition.    Elimination  of  this  overhead  has  produced   a   faster
translator.  MINI uses the hand coded scanner in the underlying RLISP.  The
other  main  aspect  of  MINI  was  the  elimination  of various META/RLISP
features  to  decrease  the  size  of  the  system  (also  decreasing   the
flexibility, but MINI has been successful for the various purposes in COG).
MINI  is  now small enough to run on small LISP systems (as long as a token
scanner is provided).  The META/RLISP features that  MINI  has  changed  or
eliminated include the following:


   a. The ability to backup the parser state upon failure is supported
      in  META/RLISP.  However, by modifying a grammar definition, the
      need  for  backup  can  be  mostly  avoided  and  was  therefore
      eliminated from MINI.

   b. META/RLISP  has  extensive  mechanisms to allow arbitrary length
      diphthongs.    MINI  only  supports  two  character  diphthongs,
      declared prior to their use.

   c. The  target  machine  language and error specification operators
      are not supported because they can be implemented  with  support
      routines.

   d. RLISP  subsyntax for specification of semantic operations is not
      supported (only LISP is provided).


Although MINI lacks many of the features of META/RLISP, it still  has  been
quite sufficient for a variety of languages.
PSL Manual                    7 February 1983                  Parser Tools
section 22.4                                                     page 22.17

22.4. BNF Description of RLISP Using MINI
22.4. BNF Description of RLISP Using MINI
22.4. BNF Description of RLISP Using MINI

  The  following  formal scheme for the translation of RLISP syntax to LISP
syntax is presented to eliminate misinterpretation of the definitions.   We
have used the above MINI syntactic form since it is close enough to BNF and
has also been checked mechanically.

  Recall   that   the   transformation   scheme  produces  an  S-expression
corresponding to the input RLISP expression.  A rule has a name by which it
is known and is defined by what follows the meta symbol :.   Each  rule  of
the set consists of one or more "alternatives" separated by the meta symbol
/,  being  the  different ways in which the rule is matched by source text.
Each rule ends with a ;.  Each alternative is composed  of  a  "recognizer"
and  a "generator".  The "generator" is a MINI + expression which builds an
S-expression from constants and elements loaded on the stack.   The  result
is  then  loaded  on the stack.  The #n and ##n refer to elements loaded by
MINI primitives or other rules.  The "generator" is thus  a  template  into
which previously generated items are substituted.  Recall that terminals in
both recognizer and generator are quoted with a ' mark.

  This  RLISP/SYSLISP  syntax  is  based  on  a  series  of  META  and MINI
definitions, started by R. Loos in 1970, continued by M. Griss,  R. Kessler
and A. Wang.

  [??? This MINI.RLISP grammar is a bit out of date ???]
  [??? This MINI.RLISP grammar is a bit out of date ???]
  [??? This MINI.RLISP grammar is a bit out of date ???]


  [??? Need to confirm for latest RLISP ???]
  [??? Need to confirm for latest RLISP ???]
  [??? Need to confirm for latest RLISP ???]



mini 'rlisp;

dip: !: , !<!< , !>!> , !:!= , !*!* , !<!= , !>!= , !' , !#!# ;

termin: '; / '$ ;               % $ used to not echo result
rtermin: @; / @$ ;

rlisp: ( cmds rtermin  .(next!-tok) )* ; % Note explicit Scan

cmds:  procdef / rexpr ;

%------ Procedure definition:

procdef: emodeproc (ftype procs/ procs) /
         ftype procs / procs ;

ftype:   'fexpr .(setq FTYPE!* 'fexpr) /  % function type
         'macro .(setq FTYPE!* 'macro) /
         'smacro .(setq FTYPE!* 'smacro) /
         'nmacro .(setq FTYPE!* 'nmacro) /
         ('expr / =T) .(setq FTYPE!* 'expr) ;
Parser Tools                  7 February 1983                    PSL Manual
page 22.18                                                     section 22.4



emodeproc: 'syslsp .(setq EMODE!* 'syslsp)/
           ('lisp/'symbolic/=T)  .(setq EMODE!* 'symbolic) ;


procs: 'procedure id proctail
           +(putd (quote #2) (quote FTYPE!* ) #1) ;

proctail: '( id[,]* ')  termin  rexpr +(quote (lambda #2 #1)) /
           termin  rexpr +(quote (lambda nil #1)) /
          id  termin  rexpr +(quote (lambda (#2) #1)) ;

%------ Rexpr definition:

rexpr: disjunction ;

disjunction: conjunction (disjunctail / =T) ;

disjunctail: ('or conjunction ('or conjunction)*)
              +.(cons 'or  (cons #3 (cons #2 #1))) ;

conjunction: negation (conjunctail / =T) ;

conjunctail: ('and negation ('and negation)*)
             +.(cons (quote and) (cons #3 (cons #2 #1))) ;

negation: 'not negation +(null #1) /
          'null negation +(null #1) /
          relation ;

relation: term reltail ;

reltail: relop term +(#2 #2 #1) / =T ;

term: ('- factor +(minus #1) / factor) termtail ;

termtail: (plusop factor +(#2 #2 #1) termtail) / =T ;

factor: powerexpr factortail ;

factortail: (timop powerexpr +(#2 #2 #1) factortail) / =T ;

powerexpr: dotexpr powtail ;

powtail: ('** dotexpr +(expt #2 #1) powtail) / =T ;

dotexpr: primary dottail ;

dottail: ('. primary +(cons #2 #1) dottail) / =T ;

primary: ifstate / groupstate / beginstate /
PSL Manual                    7 February 1983                  Parser Tools
section 22.4                                                     page 22.19

         whilestate / repeatstate / forstmts /
         definestate / onoffstate / lambdastate /
         ('( rexpr ') ) /
         ('' (lists / id / num) +(quote #1)) /
         id primtail / num ;

primtail:(':= rexpr +(setq #2 #1)) /
         (': labstmts ) /
         '( actualst / (primary +(#2 #1)) / =T ;

lists: '( (elements)* ') ;

elements: lists / id / num ;

%------ If statement:

ifstate: 'if rexpr 'then rexpr elserexpr
              +(cond (#3 #2) (T #1)) ;

elserexpr: 'else rexpr / =T +nil ;

%------ While statement:

whilestate: 'while rexpr 'do rexpr
            +(while #2 #1) ;

%----- Repeat statement:

repeatstate: 'repeat rexpr 'until rexpr
             +(repeat #2 #1) ;

%---- For statement:

forstmts: 'for fortail ;

fortail: ('each foreachstate) / forstate ;

foreachstate: id inoron rexpr actchoice rexpr
              +(foreach #5 #4 #3 #2 #1) ;

inoron: ('in +in / 'on +on) ;

actchoice: ('do +do / 'collect +collect / 'conc +conc) ;

forstate: id ':= rexpr loops ;

loops: (': rexpr types rexpr
       +(for #5 (#4 1 #3) #2 #1) ) /
       ('step rexpr 'until rexpr types rexpr
       +(for #6 (#5 #4 #3) #2 #1) ) ;

types: ('do +do / 'sum +sum / 'product +product) ;
Parser Tools                  7 February 1983                    PSL Manual
page 22.20                                                     section 22.4


%----- Function call parameter list:

actualst: ') +(#1) / rexpr[,]* ') +.(cons #2 #1) ;

%------ Compound group statement:

groupstate: '<< rexprlist '>> +.(cons (quote progn) #1) ;

%------ Compound begin-end statement:

beginstate: 'begin blockbody 'end ;

blockbody: decllist blockstates
            +.(cons (quote prog) (cons #2 #1)) ;

decllist: (decls[;]* +.(flatten #1)) / (=T +nil) ;

decls: ('integer  / 'scalar) id[,]* ;

blockstates: labstmts[;]* ;

labstmts: ('return rexpr +(return #1)) /
          (('goto / 'go 'to) id +(go #1)) /
          ('if rexpr 'then labstmts blkelse
               +(cond (#3 #2) (T #1))) /
          rexpr ;

blkelse: 'else labstmts / =T +nil ;

rexprlist: rexpr [;]* ;

lambdastate: 'lambda lamtail ;

lamtail: '( id[,]* ')  termin  rexpr +(lambda #2 #1) /
          termin  rexpr +(lambda nil #1) /
         id  termin  rexpr +(lambda (#2) #1) ;

%------ Define statement: (id and value are put onto table
%       named DEFNTAB:

definestate: 'define delist +.(cons (quote progn) #1) ;

delist: (id '= rexpr +(put (quote #2)  (quote defntab)
              (quote #1)))[,]* ;

%------ On or off statement:

onoffstate: ('on +T / 'off +nil) switchlists ;

switchlists: 'defn +(set '!*defn #1) ;
PSL Manual                    7 February 1983                  Parser Tools
section 22.4                                                     page 22.21

timop: ('* +times / '/ +quotient) ;

plusop: ('+ +plus2 / '- +difference) ;

relop: ('< +lessp / '<= +lep / '= +equal /
           '>= +gep / '> +greaterp) ;


FIN


REDUCE Historical
REDUCE Sourceforge Project | Historical SVN Repository | GitHub Mirror | SourceHut Mirror | NotABug Mirror | Chisel Mirror | Chisel RSS ]