Artifact 5482c246b16acb1f3fe51e993f40ed58a5da056f390c9d197ba648f7fff147ae:
- File
psl-1983/3-1/lpt/22-parser.lpt
— part of check-in
[eb17ceb7f6]
at
2020-04-21 19:40:01
on branch master
— Add Reduce 3.0 to the historical section of the archive, and some more
files relating to version sof PSL from the early 1980s. Thanks are due to
Paul McJones and Nelson Beebe for these, as well as to all the original
authors.git-svn-id: https://svn.code.sf.net/p/reduce-algebra/code/historical@5328 2bfe0521-f11c-4a00-b80e-6202646ff360 (user: arthurcnorman@users.sourceforge.net, size: 43763) [annotate] [blame] [check-ins using] [more...]
- File
psl-1983/lpt/22-parser.lpt
— part of check-in
[eb17ceb7f6]
at
2020-04-21 19:40:01
on branch master
— Add Reduce 3.0 to the historical section of the archive, and some more
files relating to version sof PSL from the early 1980s. Thanks are due to
Paul McJones and Nelson Beebe for these, as well as to all the original
authors.git-svn-id: https://svn.code.sf.net/p/reduce-algebra/code/historical@5328 2bfe0521-f11c-4a00-b80e-6202646ff360 (user: arthurcnorman@users.sourceforge.net, size: 43763) [annotate] [blame] [check-ins using]
PSL Manual 7 February 1983 Parser Tools section 22.0 page 22.1 CHAPTER 22 CHAPTER 22 CHAPTER 22 PARSER TOOLS PARSER TOOLS PARSER TOOLS 22.1. Introduction . . . . . . . . . . . . . . . 22.1 22.2. The Table Driven Parser . . . . . . . . . . . 22.2 22.2.1. Flow Diagram for the Parser. . . . . . . . 22.2 22.2.2. Associating the Infix Operator with a Function . 22.4 22.2.3. Precedences . . . . . . . . . . . . . 22.5 22.2.4. Special Cases of 0 <-0 and 0 0. . . . . . . 22.5 22.2.5. Parenthesized Expressions . . . . . . . . 22.5 22.2.6. Binary Operators in General. . . . . . . . 22.6 22.2.7. Assigning Precedences to Key Words . . . . . 22.7 22.2.8. Error Handling . . . . . . . . . . . . 22.7 22.2.9. The Parser Program for the RLISP Language . . . 22.7 22.2.10. Defining Operators . . . . . . . . . . 22.8 22.3. The MINI Translator Writing System. . . . . . . . 22.10 22.3.1. A Brief Guide to MINI. . . . . . . . . . 22.10 22.3.2. Pattern Matching Rules . . . . . . . . . 22.12 22.3.3. A Small Example. . . . . . . . . . . . 22.12 22.3.4. Loading Mini. . . . . . . . . . . . . 22.13 22.3.5. Running Mini. . . . . . . . . . . . . 22.13 22.3.6. MINI Error messages and Error Recovery . . . . 22.13 22.3.7. MINI Self-Definition . . . . . . . . . . 22.13 22.3.8. The Construction of MINI. . . . . . . . . 22.15 22.3.9. History of MINI Development. . . . . . . . 22.16 22.4. BNF Description of RLISP Using MINI . . . . . . . 22.17 22.1. Introduction 22.1. Introduction 22.1. Introduction In many applications, it is convenient to define a special "problem-oriented" language, tailored to provide a natural input format. Examples include the RLISP ALGOL-like surface language for algebraic work, graphics languages, boolean query languages for data-base, etc. Another ________ important case is the requirement to accept existing programs in some language, either to translate them to another language, to compile to machine language, to be able to adapt existing code into the PSL environment (e.g. mathematical libraries, etc.), or because we wish to use PSL based tools to analyze a program written in another language. One approach is to hand-code a program in PSL (called a "parser") that translates the input language to the desired form; this is tedious and error prone, and it is more convenient to use a "parser-writing-tool". In this Chapter we describe in detail two important parser writing tools available to the PSL programmer: an extensible table-driven parser that is used for the RLISP parser (described in Chapter 3), and the MINI parser generator. The table-driven parser is most useful for languages that are Parser Tools 7 February 1983 PSL Manual page 22.2 section 22.1 simple extensions of RLISP, or in fact for rapidly adding new syntactic constructs to RLISP. The MINI system is used for the development of more complete user languages. 22.2. The Table Driven Parser 22.2. The Table Driven Parser 22.2. The Table Driven Parser The parser is a top-down recursive descent parser, which uses a table of ___________ Precedences to control the parse; if numeric precedence is not adequate, LISP functions may be inserted into the table to provide more control. The parser described here was developed by Nordstrom [Nordstrom 73], and is very similar to parser described by Pratt [Pratt 73], and apparently used for the CGOL language, another LISP surface language. Scan Scan Scan Scan The parser reads tokens from an input stream using a function Scan. Scan ChannelReadToken ChannelReadToken calls the ChannelReadToken function described in Chapter 12, and performs some additional checks, described below. Each token is defined to be one of the following: non-operator O right operator O-> binary operator <-O-> All combinations of . . .O-> O. . . and O <-O->. . . are supposed to be legal, while the combinations . . .O-> <-O->. . ., . . .<-O-> <-O->. . . and O O. . . are normally illegal (error ARG MISSING and error OP MISSING, respectively). __ With each operator (which must be an id) is associated a construction function, a right precedence, and for binary operators, a left precedence. The Unary Prefix operators have this information stored under the indicator 'RLISPPREFIX and Binary operators have it stored under 'RLISPINFIX. (Actually, the indicator used at any time during parsing is the VALUE of GRAMPREFIX or GRAMINFIX, which may be changed by the user). 22.2.1. Flow Diagram for the Parser 22.2.1. Flow Diagram for the Parser 22.2.1. Flow Diagram for the Parser In this diagram RP stands for Right Precedence, LP for Left Precedence and CF for Construction Function. OP is a global variable which holds the current token. PSL Manual 7 February 1983 Parser Tools section 22.2 page 22.3 procedure PARSE(RP); RDRIGHT(RP,SCAN()); % SCAN reads next token RDRIGHT(RP,Y) | \|/ | ------------------------ | |yes | Y is Right OP |-----> Y:=APPLY(Y.CF, | | RDRIGHT(Y.RP)); ------------------------ . | . \|/ no . | . ------------------------ . ERROR yes| | no . ARG <----| Y is Binary OP |----> OP:= . MISSING | | SCAN(); . ------------------------ . . |--------<------------<------* RDLEFT: \|/ ^ | ^ ------------------------ ^ ERROR no| | ^ OP <----| OP is Binary | ^ MISSING | | ^ ------------------------ ^ | ^ \|/ yes ^ | ^ ------------------------ ^ RETURN yes| |no ^ (Y) <----| RP > OP.lp |---> Y:=APPLY(OP.cf,Y, ------------------------ PARSE(OP.lp,SCAN()); Parser Tools 7 February 1983 PSL Manual page 22.4 section 22.2 This diagram reflects the major behavior, though some trivial additions are included in the RLISP case to handle cases such as OP-> <-OP, '!;, etc. [See PU:RLISP-PARSER.RED for full details.] The technique involved may also be described by the following figure: . . . 0-> Y <-0 . . . rp lp Y is a token or an already parsed expression between two operators (as indicated). If 0->'s RP is greater than <-0's LP, then 0-> is the winner and Y goes to 0->'s construction function (and vice versa). The result from the construction function is a "new Y" in another parse situation. By associating precedences and construction functions with the operators, we are now able to parse arithmetic expressions (except for function calls) and a large number of syntactical constructions such as IF - THEN - ELSE - ; etc. The following discussion of how to expand the parser to cover a language such as RLISP (or ALGOL) may also be seen as general tools for handling the parser and defining construction functions and precedences. 22.2.2. Associating the Infix Operator with a Function 22.2.2. Associating the Infix Operator with a Function 22.2.2. Associating the Infix Operator with a Function Scan RAtomHook Scan RAtomHook __ __ The Scan, after calling RAtomHook, checks ids and special ids (those with TOKTYPE!* = 3) to see if they should be renamed from external form to Plus2 Plus2 internal form (e.g. '!+ to Plus2). This is done by checking for a NEWNAM __ __ or NEWNAM!-OP property on the id. For special ids, the NEWNAM!-OP property is first checked. The value of the property is a replacement token, i.e. PUT('!+,'NEWNAM!-OP,'PLUS2) has been done. Scan RlispRead Scan RlispRead Scan also handles the ' mark, calling RlispRead to get the S-expression. RlispRead Read RlispRead Read RlispRead is a version of Read, using a special SCANTABLE, RLISPREADSCANTABLE!*. Scan Scan The function Scan also sets SEMIC!* to '!; or '!$ if CURSYM!* is detected to be '!*SEMICOL!* (the internal name for '!; and "!$). This controls the RLISP echo/no-echo capability. Finally, if the renamed token is 'COMMENT ReadCh ReadCh then characters are ReadCh'd until a '!; or '!$ . PSL Manual 7 February 1983 Parser Tools section 22.2 page 22.5 22.2.3. Precedences 22.2.3. Precedences 22.2.3. Precedences To set up precedences, it is often helpful to set up a precedence matrix of the operators involved. If any operator has one "precedence" with respect to one particular operator and another "precedence" with respect to some other, it is sometimes not possible to run the parser with just numbered precedences for the operators without introducing ambiguities. If this is the case, replace the number RP by the operator RP and test with something like: IF RP *GREATER* OP . . . *GREATER* may check in the precedence matrix. An example in which such a scheme might be used is the case for which ALGOL uses ":" both as a label marker and as an index separator (although in this case there is no need for the change above). It is also a good policy to have even numbers for right precedences and odd numbers for left precedences (or vice versa). 22.2.4. Special Cases of 0 <-0 and 0 0 22.2.4. Special Cases of 0 <-0 and 0 0 22.2.4. Special Cases of 0 <-0 and 0 0 If . . .0 0. . . is a legal case (i.e. F A may translate to (F A)), ERROR OP MISSING is replaced by: Y:=REPCOM(Y,RDRIGHT(99,OP)); GO TO RDLEFT; The value 99 is chosen in order to have the first object (F) behave as a right operator with maximum precedence. If . . .0 <-0. . . is legal for some combinations of operators, replace ERROR ARG MISSING by something equivalent to the illegal RLISP statement: IF ISOPOP(OP,RP,Y) THEN <<OP:=Y; Y:=(something else, i.e. NIL); GOTO RDLEFT>> ELSE ERROR ARG MISSING; ISOPOP is supposed to return T if the present situation is legal. 22.2.5. Parenthesized Expressions 22.2.5. Parenthesized Expressions 22.2.5. Parenthesized Expressions (a) is to be translated to a. E.g. Parser Tools 7 February 1983 PSL Manual page 22.6 section 22.2 BEGIN a END translates to (PROG a). Define "(" and BEGIN as right operators with low precedences (2 and -2 respectively). Also define ")" and END as binary operators with matching left precedences (1 and -3 respectively). The construction functions for "(" and BEGIN are then something like: [See pu:RLISP-PARSER.RED for exact details on ParseBEGIN] BEGIN (X);PROG2(OP:=SCAN();MAKEPROG(X)); "(" (X);PROG2(IF OP=') THEN OP:=SCAN() ELSE ERROR, x); Note that the construction functions in these cases have to read the next token; that is the effect of ")" closing the last "(" and not all earlier "("'s. This is also an example of binary operators declared only for the purpose of having a left precedence. 22.2.6. Binary Operators in General 22.2.6. Binary Operators in General 22.2.6. Binary Operators in General As almost all binary operators have a construction function like LIST(OP,X,Y); it is assumed to be of that kind if no other is given. If OP is a binary operator, then "a OP b OP c" is interpreted as "(a OP b) OP c" only if OP's LP is less than OP's RP. Example: A + B + C translates to (A + B) + C because +'RP = 20 and +'LP = 19 A ^ B ^ C translates to A ^ (B ^ C) because ^'RP = 20 and ^'LP = 21 If you want some operators to translate to n-ary expressions, you have to define a proper construction function for that operator. Example: PLUS (X,Y); IF CAR(X) = 'PLUS THEN NCONC(X,LIST(Y)) ELSE LIST('PLUS,X,Y); PSL Manual 7 February 1983 Parser Tools section 22.2 page 22.7 By defining "," and ";" as ordinary binary operators, the parser automatically takes care of constructions like . . .e,e,e,e,e. . . and . . . stm;stm;stm;stm;. . . It is then up to some other operators to remove the "," or the ";" from the parsed result. 22.2.7. Assigning Precedences to Key Words 22.2.7. Assigning Precedences to Key Words 22.2.7. Assigning Precedences to Key Words If you want some operators to have control immediately, insert IF RP = NIL THEN RETURN Y ELSE as the very first test in RDRIGHT and set the right precedence of those to NIL. This is sometimes useful for key-word expressions. If entering a construction function of such an operator, X is the token immediately after the operator. E.g.: We want to parse PROCEDURE EQ(X,Y); . . . Define PROCEDURE as a right operator with NIL as precedence. The construction function for PROCEDURE can always call the parser and set the rest of the expression. Note that if PROCEDURE was not defined as above, the parser would misunderstand the expression in the case of EQ as declared as a binary operator. 22.2.8. Error Handling 22.2.8. Error Handling 22.2.8. Error Handling For the present, if an error occurs a message is printed but no attempt is made to correct or handle the error. Mostly the parser goes wild for a while (until a left precedence less than current right precedence is found) and then goes on as usual. 22.2.9. The Parser Program for the RLISP Language 22.2.9. The Parser Program for the RLISP Language 22.2.9. The Parser Program for the RLISP Language SCAN(); The purpose of this function is to read the next token from the input stream. It uses the general purpose table driven token scanner described in Chapter 12, with a specially set up ReadTable, RLISPSCANTABLE!*. As Scan __________ Scan RLISP has multiple identifiers for the same operators, Scan uses the following translation table: = EQUAL >= GEQ + PLUS > GREATERP - DIFFERENCE <= LEQ / QUOTIENT < LESSP . CONS * TIMES := SETQ ** EXPT Scan Scan In these cases, Scan returns the right hand side of the table values. Scan Scan Also, two special cases are taken care of in Scan: Parser Tools 7 February 1983 PSL Manual page 22.8 section 22.2 a. ' is the QUOTE mark. If a parenthesized expression follows ' then the syntax within the parenthesis is that of LISP, using a special scan table, RLISPREADSCANTABLE!*. The only major difference from ordinary LISP is that ! is required for all special characters. b. ! in RLISP means actually two things: i. the following symbol is not treated as a special symbol (but belongs to the print name of the atom in process); ii. the atom created cannot be an operator. Example: !( in the text behaves as the atom "(". To signal to the parser that this is the case, the flag variable ESCAPEFL must be set to T if this situation occurs. 22.2.10. Defining Operators 22.2.10. Defining Operators 22.2.10. Defining Operators To define operators use: DEFINEROP(op,p{,stm}); For right or prefix operators. DEFINEBOP(op,lp,rp{,stm}); For binary operators. These use the VALUE of DEFPREFIX and DEFINFIX to store the precedences and construction functions. The default is set for RLISP, to be __________ 'RLISPPREFIX and 'RLISPINFIX. The same identifier can be defined both as the right and binary operator. The context defines which one applies. Stm is the construction function. If stm is omitted, the common defaults are used: LIST(OP,x) prefix case, x is parsed expression following, x=RDRIGHT(p,SCAN()). LIST(OP,x,y) binary case, x is previously parsed expression, y is expression following, y=RDRIGHT(rp,SCAN()). __ If stm is an id, it is assumed to be a procedure of one or two arguments, PSL Manual 7 February 1983 Parser Tools section 22.2 page 22.9 for "x" or "x,y". If it is an expression, it is embedded as (LAMBDA(X) stm) or (LAMBDA(X Y) stm), and should refer to X and Y, as needed. Also remember that the free variable OP holds the last token (normally the binary operator which stopped the parser). If "p" or "rp" is NIL, RDRIGHT is not called by default, so that only SCAN() (the next token) is passed. For example, DEFINEBOP('DIFFERENCE,17,18); % Most common case, left associative, stm=LIST(OP,x,y); DEFINEBOP('CONS,23,21); % Right Associative, default stm=LIST(OP,x,y) DEFINEBOP('AND,11,12,ParseAND); % Left Associative, special function PROCEDURE ParseAND(X,Y); NARY('AND,X,Y); DEFINEBOP('SETQ,7,6,ParseSETQ); % Right Associative, Special Function PROCEDURE ParseSETQ(LHS,RHS); LIST(IF ATOM LHS THEN 'SETQ ELSE 'SETF, LHS, RHS); DEFINEROP('MINUS,26); % default C-fn, just (list OP arg) DEFINEROP('PLUS,26,ParsePLUS1); % DEFINEROP('GO,NIL,ParseGO ); % Special Function, DO NOT use default PARSE ahead PROCEDURE ParseGO X; X is now JUST next-token IF X EQ 'TO THEN LIST('GO,PARSE0(6,T)) % Explicit Parse ahead ELSE <<OP := SCAN(); % get Next Token LIST('GO,X)>>; DEFINEROP('GOTO,NIL,ParseGOTO ); % Suppress Parse Ahead, just pass NextToken PROCEDURE ParseGOTO X; <<OP := SCAN(); LIST('GO,X)>>; Parser Tools 7 February 1983 PSL Manual page 22.10 section 22.3 22.3. The MINI Translator Writing System 22.3. The MINI Translator Writing System 22.3. The MINI Translator Writing System Note that MINI is now autoloading. 22.3.1. A Brief Guide to MINI 22.3.1. A Brief Guide to MINI 22.3.1. A Brief Guide to MINI The following is a brief introduction to MINI, the reader is referred to [Marti 79] for a more detailed discussion of the META/RLISP operators, which are very similar to those of MINI. The MINI system reads in a definition of a translator, using a BNF-like form. This is processed by MINI into a set of LISP functions, one for each production, which make calls on each other, and a set of support routines that recognize a variety of simple constructs. MINI uses a stack to perform parsing, and the user can access sub-trees already on the stack, replacing them by other trees built from these sub-trees. The primitive __ _______ functions that recognize ids, integers, etc. each place their recognized token on this stack. For example, FOO: ID '!- ID +(PLUS2 #2 #1) ; defines a rule FOO, which recognizes two identifiers separated by a minus __________ sign (each ID pushes the recognized identifier onto the stack). The last expression replaces the top 2 elements on the stack (#2 pops the first ID pushed onto the stack, while #1 pops the other) with a LISP statement. Id Id _______ ____ (Id ): boolean expr __________ See if current token is an identifier and not a keyword. If it is, then push onto the stack and fetch the next token. AnyId AnyId _______ ____ (AnyId ): boolean expr __ See if current token is an id whether or not it is a key word. AnyTok AnyTok _______ ____ (AnyTok ): boolean expr Always succeeds by pushing the current token onto the stack. Num Num _______ ____ (Num ): boolean expr ______ Tests to see if the current token is a number, if so it pushes ______ the number onto the stack and fetches the next token. PSL Manual 7 February 1983 Parser Tools section 22.3 page 22.11 Str Str _______ ____ (Str ): boolean expr Num Num ______ Same as Num, except for strings. Specification of a parser using MINI consists of defining the syntax with BNF-like rules and semantics with LISP expressions. The following is a brief list of the operators: ' Used to designate a terminal symbol (i.e. 'WHILE, 'DO, '!=). Identifier Specifies a nonterminal. ( ) Used for grouping (i.e. (FOO BAR) requires rule FOO to parse followed immediately by BAR). < > Optional parse, if it fails then continue (i.e. <FOO> tries to parse FOO). / Optional rules (i.e. FOO / BAR allows either FOO or BAR to parse, with FOO tested first). STMT* Parse any number of STMT. STMT[ANYTOKEN]* Parse any number of STMT separated by ANYTOKEN, create a list and __________ push onto the stack (i.e. ID[,]* parses a number of identifiers separated by commas, like in an argument list). _______ ##n Refer to the nth stack location (n must be an integer). _______ #n Pop the nth stack location (n must be an integer). +(STMT) Push the unevaluated (STMT) onto the stack. .(SEXPR) Evaluate the SEXPR and ignore the result. =(SEXPR) Evaluate the SEXPR and test if result non-NIL. +.(SEXPR) Evaluate the SEXPR and push the result on the stack. @ANYTOKEN Specifies a statement terminator; used in the error recovery mechanism to search for the occurrence of errors. @@ANYTOKEN Grammar terminator; also stops scan, but if encountered in error-recovery, terminates grammar. Parser Tools 7 February 1983 PSL Manual page 22.12 section 22.3 22.3.2. Pattern Matching Rules 22.3.2. Pattern Matching Rules 22.3.2. Pattern Matching Rules In addition to the BNF-like rules that define procedures with 0 arguments and which scan tokens by calls on NEXT!-TOK() and operate on the stack, MINI also includes a simple TREE pattern matcher and syntax to define PatternProcedures that accept and return a single argument, trying a series of patterns until one succeeds. E.g. template -> replacement PATTERN = (PLUS2 &1 0) -> &1, (PLUS2 &1 &1) -> (LIST 'TIMES2 2 &1), &1 -> &1; defines a pattern with 3 rules. &n is used to indicate a matched sub-tree in both the template and replacement. A repeated &n, as in the second Equal Equal rule, requires Equal sub-trees. 22.3.3. A Small Example 22.3.3. A Small Example 22.3.3. A Small Example % A simple demo of MINI, to produce a LIST-NOTATION reader. % INVOKE 'LSPLOOP reads S-expressions, separated by ; mini 'lsploop; % Invoke MINI, give name of ROOT % Comments can appear anywhere, % prefix by % to end-of-line lsploop:lsp* @@# ; % @@# is GRAMMAR terminator % like '# but stops TOKEN SCAN lsp: sexp @; % @; is RULE terminator, like '; .(print #1) % but stops SCAN, to print .(next!-tok) ; % so call NEXT!-TOK() explicitly sexp: id / num / str / '( dotexp ') ; dotexp: sexp* < '. sexp +.(attach #2 #1) > ; fin symbolic procedure attach(x,y); <<for each z in reverse x do y:=z . y; y>>; 22.3.4. Loading Mini 22.3.4. Loading Mini 22.3.4. Loading Mini MINI is loaded from PH: using LOAD MINI;. PSL Manual 7 February 1983 Parser Tools section 22.3 page 22.13 22.3.5. Running Mini 22.3.5. Running Mini 22.3.5. Running Mini Invoke Invoke A MINI grammar is run by calling Invoke rootname;. This installs appropriate Key Words (stored on the property list of rootname), and start the grammar by calling the Rootname as first procedure. 22.3.6. MINI Error messages and Error Recovery 22.3.6. MINI Error messages and Error Recovery 22.3.6. MINI Error messages and Error Recovery If MINI detects a non-fatal error, a message be printed, and the current token and stack is shown. MINI then calls NEXT!-TOK() repeatedly until either a statement terminator (@ANYTOKEN) or grammar terminator (@ANYTOKEN) is seen. If a grammar terminator, the grammar is exited; otherwise parsing resumes from the ROOT. [??? Interaction with BREAK loop rather poor at the moment ???] [??? Interaction with BREAK loop rather poor at the moment ???] [??? Interaction with BREAK loop rather poor at the moment ???] 22.3.7. MINI Self-Definition 22.3.7. MINI Self-Definition 22.3.7. MINI Self-Definition % The following is the definition of the MINI meta system in terms of % itself. Some support procedures are needed, and exist in a % separate file. % To define a grammar, call the procedure MINI with the argument % being the root rule name. Then when the grammar is defined it may % be called by using INVOKE root rule name. % The following is the MINI Meta self definition. MINI 'RUL; % Define the diphthongs to be used in the grammar. DIP: !#!#, !-!>, !+!., !@!@ ; % The root rule is called RUL. RUL: ('DIP ': ANYTOK[,]* .(DIPBLD #1) '; / (ID .(SETQ !#LABLIST!# NIL) ( ': ALT +(DE #2 NIL #1) @; / '= PRUL[,]* @; .(RULE!-DEFINE '(PUT(QUOTE ##2)(QUOTE RB) (QUOTE #1))) +(DE ##1 (A) (RBMATCH A (GET (QUOTE #1) (QUOTE RB)) NIL))) .(RULE!-DEFINE #1) .(NEXT!-TOK) ))* @@FIN ; % An alternative is a sequence of statements separated by /'s; ALT: SEQ < '/ ALT +(OR #2 #1) >; % A sequence is a list of items that must be matched. SEQ: REP < SEQ +(AND #2 (FAIL!-NOT #1)) >; Parser Tools 7 February 1983 PSL Manual page 22.14 section 22.3 % A repetition may be 0 or more single items (*) or 0 or more items % separated by any token (ID[,]* parses a list of ID's separated % by ,'s. REP: ONE <'[ (ID +(#1) / '' ANYKEY +(EQTOK!-NEXT (QUOTE #1)) / ANYKEY +(EQTOK!-NEXT (QUOTE #1))) '] +(AND #2 #1) '* BLD!-EXPR / '* BLD!-EXPR>; % Create an sexpression to build a repetition. BLD!-EXPR: +(PROG (X) (SETQ X (STK!-LENGTH)) $1 (COND (#1 (GO $1))) (BUILD!-REPEAT X) (RETURN T)); ANYKEY: ANYTOK .(ADDKEY ##1) ; % Add a new KEY % One defines a single item. ONE: '' ANYKEY +(EQTOK!-NEXT (QUOTE #1)) / '@ ANYKEY .(ADDRTERM ##1) +(EQTOK (QUOTE #1)) / '@@ ANYKEY .(ADDGTERM ##1) +(EQTOK (QUOTE #1)) / '+ UNLBLD +(PUSH #1) / '. EVLBLD +(PROGN #1 T) / '= EVLBLD / '< ALT '> +(PROGN #1 T) / '( ALT ') / '+. EVLBLD +(PUSH #1) / ID +(#1) ; % This rule defines an un evaled list. It builds a list with % everything quoted. UNLBLD: '( UNLBLD ('. UNLBLD ') +(CONS #2 #1) / UNLBLD* ') +(LIST . (#2 . #1)) / ') +(LIST . #1)) / LBLD / ID +(QUOTE #1) ; % EVLBLD builds a list of evaled items. EVLBLD: '( EVLBLD ('. EVLBLD ') +(CONS #2 #1) / EVLBLD* ') +(#2 . #1) / ') ) / LBLD / ID ; LBLD: '# NUM +(EXTRACT #1) / '## NUM +(REF #1) / '$ NUM +(GENLAB #1) / '& NUM +(CADR (ASSOC #1 (CAR VARLIST))) / NUM / STR / '' ('( UNLBLD* ') +(LIST . #1) / ANYTOK +(QUOTE #1)); PSL Manual 7 February 1983 Parser Tools section 22.3 page 22.15 % Defines the pattern matching rules (PATTERN -> BODY). PRUL: .(SETQ INDEXLIST!* NIL) PAT '-> (EVLBLD)* +(LAMBDA (VARLIST T1 T2 T3) (AND . #1)) .(SETQ PNAM (GENSYM)) .(RULE!-DEFINE (LIST 'PUTD (LIST 'QUOTE PNAM) '(QUOTE EXPR) (LIST 'QUOTE #1))) +.(CONS #1 PNAM); % Defines a pattern. % We now allow the . operator to be the next to last in a (). PAT: '& ('< PSIMP[/]* '> NUM +.(PROGN (SETQ INDEXLIST!* (CONS ##1 INDEXLIST!*)) (LIST '!& #2 #1) ) / NUM +.(COND ((MEMQ ##1 INDEXLIST!*) (LIST '!& '!& #1)) (T (PROGN (SETQ INDEXLIST!* (CONS ##1 INDEXLIST!*)) (LIST '!& #1)))) ) / ID / '!( PAT* <'. PAT +.(APPEND #2 #1)> '!) / '' ANYTOK / STR / NUM ; % Defines the primitives in a pattern. PSIMP: ID / NUM / '( PSIMP* ') / '' ANYTOK; % The grammar terminator. FIN 22.3.8. The Construction of MINI 22.3.8. The Construction of MINI 22.3.8. The Construction of MINI MINI is actually described in terms of a support package for any MINI-generated parser and a self-description of MINI. The useful files (on PU: and PL:) are as follows: MINI.MIN The self definition of MINI in MINI. MINI.SL A Standard LISP version of MINI.MIN, translated by MINI itself. MINI.RED The support RLISP for MINI. MINI-PATCH.RED and MINI.FIX Some additions being tested. MINI.LAP The precompiled LAP file. Use LOAD MINI. MINI-LAP-BUILD.CTL A batch file that builds PL:MINI.LAP from the above files. MINI-SELF-BUILD.CTL A batch file that builds the MINI.SL file by loading and translating MINI.MIN. Parser Tools 7 February 1983 PSL Manual page 22.16 section 22.3 22.3.9. History of MINI Development 22.3.9. History of MINI Development 22.3.9. History of MINI Development The MINI Translator Writing System was developed in two steps. The first was the enhancement of the META/RLISP [Marti 79] system with the definition of pattern matching primitives to aid in describing and performing tree-to-tree transformations. META/RLISP is very proficient at translating an input programming language into LISP or LISP-like trees, but did not have a good method for manipulating the trees nor for direct generation of target machine code. PMETA (as it was initially called) [Kessler 79] solved these problems and created a very good environment for the development of compilers. In fact, the PMETA enhancements have been fully integrated into META/RLISP. The second step was the elimination of META/RLISP and the development of a smaller, faster system (MINI). Since META/RLISP was designed to provide maximum flexibility and full generality, the parsers that is creates are large and slow. One of its most significant problems is that it uses its own single character driven LISP functions for token scanning and recognition. Elimination of this overhead has produced a faster translator. MINI uses the hand coded scanner in the underlying RLISP. The other main aspect of MINI was the elimination of various META/RLISP features to decrease the size of the system (also decreasing the flexibility, but MINI has been successful for the various purposes in COG). MINI is now small enough to run on small LISP systems (as long as a token scanner is provided). The META/RLISP features that MINI has changed or eliminated include the following: a. The ability to backup the parser state upon failure is supported in META/RLISP. However, by modifying a grammar definition, the need for backup can be mostly avoided and was therefore eliminated from MINI. b. META/RLISP has extensive mechanisms to allow arbitrary length diphthongs. MINI only supports two character diphthongs, declared prior to their use. c. The target machine language and error specification operators are not supported because they can be implemented with support routines. d. RLISP subsyntax for specification of semantic operations is not supported (only LISP is provided). Although MINI lacks many of the features of META/RLISP, it still has been quite sufficient for a variety of languages. PSL Manual 7 February 1983 Parser Tools section 22.4 page 22.17 22.4. BNF Description of RLISP Using MINI 22.4. BNF Description of RLISP Using MINI 22.4. BNF Description of RLISP Using MINI The following formal scheme for the translation of RLISP syntax to LISP syntax is presented to eliminate misinterpretation of the definitions. We have used the above MINI syntactic form since it is close enough to BNF and has also been checked mechanically. Recall that the transformation scheme produces an S-expression corresponding to the input RLISP expression. A rule has a name by which it is known and is defined by what follows the meta symbol :. Each rule of the set consists of one or more "alternatives" separated by the meta symbol /, being the different ways in which the rule is matched by source text. Each rule ends with a ;. Each alternative is composed of a "recognizer" and a "generator". The "generator" is a MINI + expression which builds an S-expression from constants and elements loaded on the stack. The result is then loaded on the stack. The #n and ##n refer to elements loaded by MINI primitives or other rules. The "generator" is thus a template into which previously generated items are substituted. Recall that terminals in both recognizer and generator are quoted with a ' mark. This RLISP/SYSLISP syntax is based on a series of META and MINI definitions, started by R. Loos in 1970, continued by M. Griss, R. Kessler and A. Wang. [??? This MINI.RLISP grammar is a bit out of date ???] [??? This MINI.RLISP grammar is a bit out of date ???] [??? This MINI.RLISP grammar is a bit out of date ???] [??? Need to confirm for latest RLISP ???] [??? Need to confirm for latest RLISP ???] [??? Need to confirm for latest RLISP ???] mini 'rlisp; dip: !: , !<!< , !>!> , !:!= , !*!* , !<!= , !>!= , !' , !#!# ; termin: '; / '$ ; % $ used to not echo result rtermin: @; / @$ ; rlisp: ( cmds rtermin .(next!-tok) )* ; % Note explicit Scan cmds: procdef / rexpr ; %------ Procedure definition: procdef: emodeproc (ftype procs/ procs) / ftype procs / procs ; ftype: 'fexpr .(setq FTYPE!* 'fexpr) / % function type 'macro .(setq FTYPE!* 'macro) / 'smacro .(setq FTYPE!* 'smacro) / 'nmacro .(setq FTYPE!* 'nmacro) / ('expr / =T) .(setq FTYPE!* 'expr) ; Parser Tools 7 February 1983 PSL Manual page 22.18 section 22.4 emodeproc: 'syslsp .(setq EMODE!* 'syslsp)/ ('lisp/'symbolic/=T) .(setq EMODE!* 'symbolic) ; procs: 'procedure id proctail +(putd (quote #2) (quote FTYPE!* ) #1) ; proctail: '( id[,]* ') termin rexpr +(quote (lambda #2 #1)) / termin rexpr +(quote (lambda nil #1)) / id termin rexpr +(quote (lambda (#2) #1)) ; %------ Rexpr definition: rexpr: disjunction ; disjunction: conjunction (disjunctail / =T) ; disjunctail: ('or conjunction ('or conjunction)*) +.(cons 'or (cons #3 (cons #2 #1))) ; conjunction: negation (conjunctail / =T) ; conjunctail: ('and negation ('and negation)*) +.(cons (quote and) (cons #3 (cons #2 #1))) ; negation: 'not negation +(null #1) / 'null negation +(null #1) / relation ; relation: term reltail ; reltail: relop term +(#2 #2 #1) / =T ; term: ('- factor +(minus #1) / factor) termtail ; termtail: (plusop factor +(#2 #2 #1) termtail) / =T ; factor: powerexpr factortail ; factortail: (timop powerexpr +(#2 #2 #1) factortail) / =T ; powerexpr: dotexpr powtail ; powtail: ('** dotexpr +(expt #2 #1) powtail) / =T ; dotexpr: primary dottail ; dottail: ('. primary +(cons #2 #1) dottail) / =T ; primary: ifstate / groupstate / beginstate / PSL Manual 7 February 1983 Parser Tools section 22.4 page 22.19 whilestate / repeatstate / forstmts / definestate / onoffstate / lambdastate / ('( rexpr ') ) / ('' (lists / id / num) +(quote #1)) / id primtail / num ; primtail:(':= rexpr +(setq #2 #1)) / (': labstmts ) / '( actualst / (primary +(#2 #1)) / =T ; lists: '( (elements)* ') ; elements: lists / id / num ; %------ If statement: ifstate: 'if rexpr 'then rexpr elserexpr +(cond (#3 #2) (T #1)) ; elserexpr: 'else rexpr / =T +nil ; %------ While statement: whilestate: 'while rexpr 'do rexpr +(while #2 #1) ; %----- Repeat statement: repeatstate: 'repeat rexpr 'until rexpr +(repeat #2 #1) ; %---- For statement: forstmts: 'for fortail ; fortail: ('each foreachstate) / forstate ; foreachstate: id inoron rexpr actchoice rexpr +(foreach #5 #4 #3 #2 #1) ; inoron: ('in +in / 'on +on) ; actchoice: ('do +do / 'collect +collect / 'conc +conc) ; forstate: id ':= rexpr loops ; loops: (': rexpr types rexpr +(for #5 (#4 1 #3) #2 #1) ) / ('step rexpr 'until rexpr types rexpr +(for #6 (#5 #4 #3) #2 #1) ) ; types: ('do +do / 'sum +sum / 'product +product) ; Parser Tools 7 February 1983 PSL Manual page 22.20 section 22.4 %----- Function call parameter list: actualst: ') +(#1) / rexpr[,]* ') +.(cons #2 #1) ; %------ Compound group statement: groupstate: '<< rexprlist '>> +.(cons (quote progn) #1) ; %------ Compound begin-end statement: beginstate: 'begin blockbody 'end ; blockbody: decllist blockstates +.(cons (quote prog) (cons #2 #1)) ; decllist: (decls[;]* +.(flatten #1)) / (=T +nil) ; decls: ('integer / 'scalar) id[,]* ; blockstates: labstmts[;]* ; labstmts: ('return rexpr +(return #1)) / (('goto / 'go 'to) id +(go #1)) / ('if rexpr 'then labstmts blkelse +(cond (#3 #2) (T #1))) / rexpr ; blkelse: 'else labstmts / =T +nil ; rexprlist: rexpr [;]* ; lambdastate: 'lambda lamtail ; lamtail: '( id[,]* ') termin rexpr +(lambda #2 #1) / termin rexpr +(lambda nil #1) / id termin rexpr +(lambda (#2) #1) ; %------ Define statement: (id and value are put onto table % named DEFNTAB: definestate: 'define delist +.(cons (quote progn) #1) ; delist: (id '= rexpr +(put (quote #2) (quote defntab) (quote #1)))[,]* ; %------ On or off statement: onoffstate: ('on +T / 'off +nil) switchlists ; switchlists: 'defn +(set '!*defn #1) ; PSL Manual 7 February 1983 Parser Tools section 22.4 page 22.21 timop: ('* +times / '/ +quotient) ; plusop: ('+ +plus2 / '- +difference) ; relop: ('< +lessp / '<= +lep / '= +equal / '>= +gep / '> +greaterp) ; FIN