NAME

cow — Parser-generator from vW2 grammar.

SYNOPSIS

wyrm::cow grammar...

DESCRIPTION

vW2 Grammar

The vW2 grammars accepted are a subset of two-level van Wijngaarden grammars, with additions for lexical analysis and parser generation controls.

grammar ::= [ rule ] ...

rule ::= property-rule | metarule | hyperrule | foreign-rule

Property rules allow characteristics of the grammar and the generated parser to be specified.

property-rule ::= start-property | lookahead-property | name-property | grammar-type-property | attribute-property | glyph-property | signature-property | enum-property | reserved-symbol

Force a metanotion to be regarded as an attribute.

attribute-property ::= attribute = metanotion [ |metanotion ] ...

The conflicts property forces the LR(k) generator to issue a shift or reduction in particular shift/reduce and reduce/reduce conflicts. If a conflict does not have a matching conflicts rule, the parser construction is aborted.

conflicts-property ::= conflicts = hypernotion : hypernotion [ | hypernotion ] ....

conflicts = proda : prodb | prodc | prodd | ...
If the conflicted symbols and productions of an inadequate state are contained in the set proda|prodb|prodc|prodd|..., the conflict is resolved to a reduction to proda. If the first hypernotion is a symbol,
conflicts = asymbol : prodb | prodc | prodd | ...
the conflict is resolved to a shift.

The glyph property allows glyphs to be renamed when literals are translated into small letters.

glyph-property ::= glyph = 'glyph' small-marks.

In the Chomsky hierarchy, a type 2 language is a context-free language, and a type 3 language is a regular language, a language which has a grammar with no embedding recursive productions. The actual grammars can be type 1 (context sensitive) or type 0 (Turing machine equivalent), but the parsers created by cow are for either deterministic type 2 grammars (LR(k)) or type 3 (DFA). The grammar analysis can discover whether the grammar is type 2 or 3, or it can be explicitly declared type 2. If a grammar is explicitly declared type 3 but is actually type 2, this rule is ignored.

grammar-type-property ::= type = 2. | type = 3.

Select the implementation language for the output of cow. The choices are

model
The cow parser model without further interpretation.
wyrmwif
C with wyrmwif extensions.

implementor-property ::= implementor = model. | implementor = wyrmwif.

The include property either inserts the contents of the URI after the full stop (.) of the include rule, or adds foreign text to the generated parser.

include-property ::= include = 'uri'. | include = [foreign-code].

The lookahead property specifies the maximum lookahead used to create the LR(k) parser. The lookahead is ignored for type 3 languages.

lookahead-property ::= k = digit....

Grammars can be named; the name appears in reports and generated parsers.

name-property ::= name = protonotion.

It is convenient to be able to define symbols that look like other symbols, but with specific spellings, the reserved symbols of the language. For example, the word 'int' in C is reserved, while 'integer' is not. While it is conceivable to define reserved symbols directly in the grammar, that adds a number of problems (ambiguity, knowing where the symbol ends without explicit %end, etc), so that it is easier to define the reserved symbol as a special spelling of another symbol.

Reserved symbols are defined with the spelling and the signature of the similar symbol they look like. The spelling refers to the concatenation of accepted characters of the similar symbol. The start set is modified to remove the reserved symbol and add the similar symbol. Then when the similar symbol is accepted, its spelling is compared to the set of reserved symbols for that state.

By default reserved symbols are searched every time the similar symbol is accepted, whether the grammar expects the reserved symbol or not; this matches the usual expectation in most languages that a reserved symbol is reserved everywhere, even where not syntactically valid. The reserved symbol can also be defined to only be recognised if the parser needs it; this allows languages, such as PL/I, to use the same spelling as a reserved and normal symbol as syntactically permitted. This is indicated by the 'as needed' suffix.

Reserved symbols cannot have an attribute. Because the symbol is already matched to a specific spelling, this is usually superfluous. It is possible to reserve multiple spellings as one symbol; it is not possible to preserve which spelling it was. If this information is required, multiple symbols must be used instead.

reserved-symbol ::= protonotion symbol = 'characters' hypernotion symbol [ everywhere ] . | protonotion symbol = 'characters' hypernotion symbol as needed.
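
The lookup described above can be pictured roughly as follows. This is a minimal sketch in Python, not cow's implementation; the names label_symbol, reserved, and expected are hypothetical, and the table layout is an assumption made for illustration.

```python
def label_symbol(spelling, similar, reserved, expected=None):
    """Label an accepted spelling of the similar symbol.

    reserved maps a spelling to (reserved symbol, as_needed flag).
    expected is the set of symbols the parser can accept in this state
    (hypothetical; only consulted for 'as needed' reserved symbols).
    """
    entry = reserved.get(spelling)
    if entry is None:
        return similar              # not a reserved spelling
    symbol, as_needed = entry
    if as_needed and (expected is None or symbol not in expected):
        return similar              # reserved only where the parser needs it
    return symbol                   # reserved everywhere (the default)


# Hypothetical table: 'int' reserved everywhere, 'if' reserved as needed.
reserved = {"int": ("int symbol", False), "if": ("if symbol", True)}
```

With this table, 'int' is always relabelled, 'integer' stays the similar symbol, and 'if' is relabelled only in states that expect the reserved symbol.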

Declare a hypernotion signature.

signature-property ::= signature = hypernotion [ ,hypernotion ] ...

The start property identifies the start rule in the grammar. If no start rule is specified, the first hyperrule is the start rule.

start-property ::= start = hypernotion.

The stylesheet property initialises the style translations for the input queue. This can be further modified while scanners are running. The style sheet indicates how XML tags, HTML tags and classes, or RTF contexts correspond to styles of characters in the *character and except*character symbols. If an input character context does not correspond to any style, the character is silently discarded.

For example, an HTML source file may have source code in <CODE> elements and comments in <I>.

<HTML><HEAD><TITLE>Source file</TITLE></HEAD>
<BODY>
<P>Gee, do you think this could be used for literate programming?</P>
<P><CODE>int gcd(int m,int n <I>m or n can be greater</I>) {...} </CODE></P>
</BODY></HTML>

stylesheet-property ::= stylesheet = style-definition....

style-definition ::= style-name: style-match [ ,style-match... ]

style-match ::= xml-match | html-match | rtf-match | escape-match

xml-match ::= xml = xml-tag [ |xml-tag... ]

xml-tag ::= hypernotion

html-match ::= html = html-tag [ |html-tag... ]

html-tag ::= protonotion | protonotion/protonotion | /protonotion

rtf-match ::= rtf = rtf-attribute [ ,rtf-attribute... ]

rtf-attribute ::= protonotion font | plain | bold | italic | underline | outline | subscript | superscript | baseline | left | right | center | justified | digits red digits green digits blue foreground | digits red digits green digits blue background | digits size | protonotion style

escape-match ::= plain = [ 'string' [ ,'string' ] ]

XML style matches are based solely on the innermost tag. An xml-match matches if any of the listed tags is the innermost tag. HTML style matches are based on either the innermost tag (protonotion), the innermost tag and its class attribute (protonotion/protonotion), or just its class attribute (/protonotion). An html-match matches if any of the listed tags and classes is the innermost tag. RTF style matches are based on the typographical and font features. An rtf-match matches if all the attributes match simultaneously.

All characters of a plain file are assigned some style and accepted. The file starts in a plain style; transition to another style can be made with a start escape sequence of characters, and back to plain with an optional stop sequence. A newline character cannot be part of a start or stop sequence.
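
The HTML matching rule can be sketched as below. This is an illustrative Python sketch under stated assumptions: the function names and the list-of-pairs stylesheet are hypothetical, tag names are compared without regard to case, and the first matching style wins; none of these choices is confirmed by cow itself.

```python
def html_match(pattern, tag, css_class=None):
    """Test one html-tag pattern against the innermost element.

    pattern is 'tag', 'tag/class', or '/class' as in the html-tag rule.
    """
    want_tag, slash, want_class = pattern.partition("/")
    if want_tag and want_tag.lower() != tag.lower():
        return False                # tag named in the pattern differs
    if slash and want_class != css_class:
        return False                # class named in the pattern differs
    return True


def style_of(stylesheet, tag, css_class=None):
    """Return the first style whose html-match lists a matching pattern,
    or None, in which case the character would be discarded."""
    for style, patterns in stylesheet:
        if any(html_match(p, tag, css_class) for p in patterns):
            return style
    return None


# Hypothetical style sheet for the HTML source file example.
sheet = [("code", ["code", "pre"]), ("comment", ["i", "/remark"])]
```

Here characters inside <CODE> get the style code, characters inside <I> get the style comment, and characters inside <P> match no style.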

metarule ::= metanotion :: hypernotion [ |hypernotion ] ....

hyperrule ::= hypernotion : [ hyperalternatives ] .

hyperalternatives ::= hyperalternative | hyperalternatives; hyperalternative

hyperalternative ::= member | hyperalternative, member

member ::= hypernotion | hypernotion==hypernotion | hypernotion/=hypernotion | hypernotion=hypernotion | hypernotion­hypernotion

rewrite-rule ::= operator input-pattern: output-pattern | rewrite-rule; output-pattern

operator ::= [ protonotion :: ]

input-pattern ::= hypernotion input-subtree-patterns

input-subtree-patterns ::= <> | <input-node-patterns> | <tail-pattern> | <input-node-patterns,tail-pattern>

input-node-patterns ::= input-node-pattern | input-node-patterns,input-node-pattern

input-node-pattern ::= hypernotion input-subtree-patterns | <metanotion>

tail-pattern ::= metanotion

output-pattern ::= [ predicate, ] ... hypernotion output-subtree-patterns [ ,predicate ] ...

output-subtree-patterns ::= <> | <output-node-patterns> | <tail-pattern> | <output-node-patterns,tail-pattern>

output-node-patterns ::= output-node-pattern | output-node-patterns,output-node-pattern

output-node-pattern ::= operator hypernotion output-subtree-patterns | <operator metanotion>

predicate ::= hypernotion

If different rewrite rules have the same operator and input-pattern, they can be combined as alternatives in one rule. One rule or many have the same meaning. The input and output pattern hypernotions are reduced to signatures and can include metanotion constructors the same way as production; the hypernotion signatures can be declared with a signature property rule. The input pattern matches a tree node if the signatures are the same and the variables unify. Predicates, if any, are evaluated and can unify more variables. If the input pattern matches and all the predicates succeed, the output pattern is evaluated. The resulting output tree replaces the input to the caller.

The empty operator rewrites are only called when a node is created in a production rule, or when a node is created in an output-pattern without an operator. Non-empty operators are only called on subtrees from an output-pattern. If the operator is nonempty, it only matches if called with the same operator. If a tree does not match any rule, with an empty or nonempty operator, it is returned unchanged.

The input-pattern matches if the signatures are the same, metanotion variables unify, and the subtree pattern matches. The node must have at least as many children as node-patterns; it can have more only if there is a tail-pattern. A node-pattern which is a hypernotion with a subtree pattern recursively matches the child node. If the node-pattern is <metanotion>, it matches any single node. A tail-pattern matches zero or more child nodes after the enumerated nodes. All metanotions with the same spelling in the input-pattern hypernotions, output-pattern hypernotions, and predicates are unified to the same value in accordance with the uniform replacement rule (URR). Metanotions used as node-patterns are a separate namespace, and metanotions used as tail-patterns are a third separate namespace. Node-patterns and tail-patterns are subject to URR, each within its own namespace; a metanotion with the same spelling in a different namespace is unrelated and does not have to be (and in fact will not be) unified to the same value.

All predicates have to succeed. In the process they can define additional metanotion variables.

If the input-pattern matches and all predicates succeed, the output-pattern constructs a replacement tree; the output-pattern always succeeds. A new tree is constructed for each hypernotion output-node-pattern, and these with the unanalyzed <metanotion> node-pattern and tail-patterns are combined as children for the next level of constructed trees. When a node-pattern hypernotion is given with an empty operator, that node as created can be rewritten. If it is given a nonempty operator, that node can be rewritten by a matching operator. That can recursively trigger more rewrite rules and alter any or all of the parse tree.

The input-pattern can match one node, the node and its children, its children and some or all grandchildren, as deeply as necessary. The output tree replaces the entire matched tree; it can flatten or deepen the top of the tree; it can move child nodes around as desired and even discard them.

The node-pattern and tail-pattern metanotions are not defined by metarules, even if there is a metarule for that metanotion. They are defined by their position in the patterns.
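
The structural part of input-pattern matching can be sketched as follows. This Python sketch is a simplification and not cow's algorithm: signatures are reduced to plain labels compared for equality, hypernotion unification and predicates are left out, and the tuple layout and the name match are hypothetical.

```python
def match(pattern, node, env=None):
    """Match an input-pattern against a tree node.

    pattern: (label, child_patterns, tail_name or None)
    child pattern: a nested pattern tuple, or a str naming a <metanotion>
    node: (label, children)
    Returns {'node': {...}, 'tail': {...}} on success, None on failure.
    """
    if env is None:
        env = {"node": {}, "tail": {}}   # two separate namespaces
    label, child_patterns, tail = pattern
    node_label, children = node
    if label != node_label:
        return None
    if len(children) < len(child_patterns):
        return None                      # too few children
    if tail is None and len(children) > len(child_patterns):
        return None                      # extra children need a tail-pattern
    for cp, child in zip(child_patterns, children):
        if isinstance(cp, str):          # <metanotion>: any single node
            if env["node"].setdefault(cp, child) != child:
                return None              # same name must bind the same node
        elif match(cp, child, env) is None:
            return None
    if tail is not None:
        rest = children[len(child_patterns):]
        if env["tail"].setdefault(tail, rest) != rest:
            return None
    return env


# A node with three children; X as a node-pattern and X as a
# tail-pattern live in different namespaces and bind different values.
tree = ("sum", [("a", []), ("b", []), ("c", [])])
env = match(("sum", ["X"], "X"), tree)
```

After the call, the node-pattern X is bound to the first child and the tail-pattern X to the remaining two children.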

Rewrite rules are powerful enough to implement translation and compilation, and even interpretation and evaluation.

Foreign text is a way to enter programming code written in another language into the text of a grammar. Attributes can be made available for input and output to the text. As far as the context free grammar is concerned, each foreign rule is interpreted as an empty production; the foreign text is then evaluated during a reduction.

The foreign text must contain a balanced number of brackets; no escapes are available, nor are any kinds of strings or comments interpreted to hide unbalanced brackets. If the foreign code must use unbalanced brackets, it must do so outside the text of the grammar.

foreign-rule ::= hypernotion : foreign-texts.

foreign-texts ::= foreign-text | foreign-texts; foreign-text

foreign-text ::= foreign-output foreign-input language [foreign-code] | language immediate [foreign-code]

foreign-output ::= [ (metanotion...) ]

foreign-input ::= [ metanotion... ]

language ::= [ protonotion ]

The language cannot end in 'immediate'

foreign-code ::= [ foreign-chunk... ]

foreign-chunk ::= any-characters-except-[-and-] | [foreign-code]
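
The bracket rule for foreign-code can be illustrated with a short scanner. This is a Python sketch of the counting rule only; the name scan_foreign_code is hypothetical and the way cow actually reads foreign text is not specified here.

```python
def scan_foreign_code(text, start):
    """Return the index just past the ']' matching the '[' at text[start].

    Brackets are counted blindly: quotes, comments, and backslashes in the
    foreign code do not hide a bracket.
    """
    if text[start] != "[":
        raise ValueError("foreign code must start with '['")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "[":
            depth += 1
        elif text[i] == "]":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unbalanced brackets in foreign code")
```

Because quoting is not interpreted, a lone ']' inside a C string literal ends the foreign code early, which is why such code has to live outside the grammar text.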

Normally in the cf parser, foreign texts are evaluated with the attributes after parsing. If the small marks 'immediate' are included after the language, the foreign text is evaluated immediately on reduction within the parser itself. In a scanner, all foreign texts are immediate, whether marked so or not. Because immediate texts are evaluated during the parse, they can alter the lexical interpretation and other aspects of the parser so that it can accept languages that it could not otherwise.

For example, C has a well-known ambiguity with typedef names. A C grammar can use immediate foreign texts in the parser and scanners with a rudimentary symbol table to remove this ambiguity.

NEST block:
	left brace symbol, push new typedef level,
		NEST declarations into NEST1, NEST1 statements,
		right brace symbol, pop off typedef level.
NEST typedef: typedef symbol, TAG symbol, make typedef.

IDENTIFIER symbol: TAG symbol, relabel if typedef.
TAG symbol: letter, letter or digit sequence option.

include = [
	#include <stdlib.h>
	#include <string.h>
	typedef struct TopLevel TopLevel;
	struct TopLevel {int depth; char *tag; TopLevel *under;};
	TopLevel *topLevel = 0; int topLevelDepth = 0;
].
push new typedef level: immediate [
	topLevelDepth++;
].
pop off typedef level: immediate [
	topLevelDepth--;
	while (topLevel && topLevel->depth>topLevelDepth) {
		TopLevel *u = topLevel->under; free(topLevel); topLevel = u;
	}
].
relabel if typedef: immediate [
	TopLevel *t; for (t=topLevel; t; t=t->under) {
		if (strcmp(t->tag,Tcl_GetString(bufferContents()))==0) {
			PS->override = true;
			PS->reserved = 0;
			PS->symbol = TYPENAMEsymbol;
			PS->nameclass = 0;
			break;
		}
	}
].
make typedef: immediate [
	TopLevel *t; t = malloc(sizeof(TopLevel));
	t->depth = topLevelDepth; t->tag = bufferString(lastlexeme.name,0);
	t->under = topLevel; topLevel = t;
].

An immediate foreign text cannot have explicit output or input variables. This is because variables are propagated by the attribute evaluator after the parse is completed, but immediate texts are evaluated before the parse is completed. Immediate texts which need to communicate need to establish some protocol with global variables.

(The language string cannot end in 'immediate' unless there is another 'immediate' after it. ximmediate[...] is an immediate text in language x; ximmediate immediate[...] is an immediate text in language ximmediate.)

hypernotion ::= small-marks | large-marks | small-marks hypernotion | large-marks hypernotion

symbol ::= letter s symbol letter y symbol letter m symbol letter b symbol letter p symbol letter l symbol | small-marks hypernotion | large-marks hypernotion

protonotion ::= small-marks

metanotion ::= large-marks

vW2 hypernotions can contain literal glyphs written in single quotes. The glyphs are translated into small marks that are the glyph's name. From cow's point of view, there is no distinction between the literal glyphs and the small marks composing their name.

small-marks ::= ' [ glyph ] ...'

glyph ::= any-single-character-except-' | ''

Default names are provided for the printable ASCII glyphs (code 32 through 126). These names can be overridden with the glyph property.

\n newline   \r return   \t tab   " " space
! exclaim   "\"" quote   # hash   $ dollar   % percent   & ampersand   '' apostrophe
( leftparen   ) rightparen   * asterisk   + plus   , comma   - dash   . fullstop   / slash
0 zero   1 one   2 two   3 three   4 four   5 five   6 six   7 seven   8 eight   9 nine
: colon   ; semicolon   < lessthan   = equals   > greaterthan   ? query   @ at
A largea   B largeb   C largec   D larged   E largee   F largef   G largeg   H largeh   I largei
J largej   K largek   L largel   M largem   N largen   O largeo   P largep   Q largeq   R larger
S larges   T larget   U largeu   V largev   W largew   X largex   Y largey   Z largez
[ leftbracket   \\ backslash   ] rightbracket   ^ circumflex   _ underline   @ at
a lettera   b letterb   c letterc   d letterd   e lettere   f letterf   g letterg   h letterh   i letteri
j letterj   k letterk   l letterl   m letterm   n lettern   o lettero   p letterp   q letterq   r letterr
s letters   t lettert   u letteru   v letterv   w letterw   x letterx   y lettery   z letterz
\{ leftbrace   | verticalbar   \} rightbrace   ~ tilde

vW2 hypernotions can contain decimal numbers which are translated to small marks. From cow's point of view, there is no distinction between the literal number and the small marks composing its name.

small-marks ::= #digit-glyph...

digit-glyph ::= 0|1|2|3|4|5|6|7|8|9

The string #digit-glyph... is translated into the small marks number(digit-glyph-name...)

where digit-glyph-name ::= zero | one | two | three | four | five | six | seven | eight | nine
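
The translation of literal glyphs and numbers into names can be sketched as follows. This Python sketch is illustrative only: the function names are hypothetical, the table is a partial copy of the default names, glyph property overrides are not modelled, and the result is returned as a list of names because the exact way cow joins them into small marks is not shown here.

```python
import string

DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]

# A partial copy of the default glyph names listed above.
GLYPH_NAMES = {" ": "space", "+": "plus", "-": "dash", "=": "equals",
               "'": "apostrophe", "(": "leftparen", ")": "rightparen"}
GLYPH_NAMES.update({d: DIGITS[int(d)] for d in string.digits})
GLYPH_NAMES.update({c: "letter" + c for c in string.ascii_lowercase})
GLYPH_NAMES.update({c: "large" + c.lower() for c in string.ascii_uppercase})


def literal_names(literal):
    """Names of the glyphs in a quoted literal such as 'a+b'.
    A doubled apostrophe inside the quotes stands for one apostrophe."""
    body = literal[1:-1].replace("''", "'")
    return [GLYPH_NAMES[g] for g in body]


def number_names(number):
    """Digit names for a number written as #digits, e.g. #42."""
    return [DIGITS[int(d)] for d in number[1:]]
```

For instance, the literal 'a+B' yields the names lettera, plus, largeb, and #42 yields four, two.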

Generated Code Commands

The output from the parser generator is syntactically a Tcl script, but none of the commands in the script are defined. The caller evaluates this script in a context where these commands are defined. The intention is the script might be examined for its simple textual elegance, a veritable work of art, or more practically the context can define Tcl commands so that the generated parser can execute from the Tcl script, or the context can define commands that in turn generate C or some other code which can then be compiled into automata code.

The generated commands are quite specific, and should be translatable easily into most imperative languages. Because it is in Tcl, the translating commands can also be used as a macro language.
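
The idea of evaluating the script in a context that supplies the commands can be sketched outside Tcl as well. The Python below is a toy model under loud assumptions: it handles only a few of the integer commands described in this section, it splits lines on whitespace instead of applying Tcl's quoting rules, and make_context and run are hypothetical names.

```python
def make_context():
    """Build a table of command implementations sharing one variable store."""
    variables = {}

    def define_integer(name, initial):
        variables[name] = int(initial)

    def assign_integer(name, value):
        variables[name] = int(value)

    def increment_integer(name):
        variables[name] += 1

    def decrement_integer(name):
        variables[name] -= 1

    commands = {
        "define_integer": define_integer,
        "assign_integer": assign_integer,
        "increment_integer": increment_integer,
        "decrement_integer": decrement_integer,
        "nop": lambda: None,
    }
    return commands, variables


def run(script, commands):
    """Evaluate a script given as lines of whitespace-separated words."""
    for line in script.splitlines():
        words = line.split()
        if words:
            commands[words[0]](*words[1:])


commands, variables = make_context()
run("define_integer depth 0\n"
    "increment_integer depth\n"
    "increment_integer depth\n"
    "decrement_integer depth\n"
    "nop", commands)
```

A different context could map the same command names to functions that emit C text instead of executing, which is the translation use described above.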

accept_character

A character is accepted to the string buffer. If the input buffer is not empty, accept from that; otherwise accept from the input queue. The input buffer may be declared as always empty.

accept_production production-index

Record the recognised production and return from the current domain.
production-index ::= integer
New AS value.

allocate num-perm

Allocate a new environment stack frame and initialise it.
num-perm ::= integer
Number of permanent variables in the frame.

anti_get_value X.i X.j

If the registers unify, this instruction fails; if they do not unify, this instruction passes.

assign_integer variable-name value

Assign the integer value to the variable.

attribute_wam name {wam-machine}

The attribute evaluation machine with the name of the parser automata.

call L.o.proc-name/n num-args R.r

Continue execution at the L.o; the current context is saved; the CP register is set to R.r.
o ::= integer
Offset of the called procedure.
num-args ::= integer
Number of arguments passed to the procedure.
r ::= integer
Address of the next instruction.

call_domain state en

Call the state state in the same domain state.
en ::= integer
New EN value.

deallocate

Restore register values and deallocate the stack frame.

decrement_integer variable-name

Subtract one (1) from the variable's value.

deferred_action nested-action {semantics}

With partial PS construction, the parse tree and its reduction are identified before the tree is completed. Its reduced action is then deferred.

define_glyph name characters

Define the actual character for a glyph name.

define_integer variable-name initial-value

Define the variable in this context and assign the integer value to it.

define_terminal_class symbol|character classes class-symbols

Define a simple class (one integer) or an aggregate class. The implementor may use this to optimise some transition_iqis instructions.
classes ::= integer [ |integer... ]
Classes the symbols belong to.
class-symbols ::= symbol [ | symbol ... ]
Symbols in the classes.

discard_character

A character is discarded. If the input buffer is not empty, discard from that; otherwise discard from the input queue. The input buffer may be declared as always empty.

eval_loop {instructions}

Evaluate the next WAM instruction or instruction section until the machine fails or passes.

execute L.o.proc-name/n num-args

Continue execution at the procedure at L.o.
o ::= integer
Offset of the called procedure.
num-args ::= integer
Number of arguments passed to the procedure.

explicit_end

An explicit %end character has been recognised and is shifted.

fail

This proof is blocked from solving the query.

foreign_immediate language foreign-text

Evaluate the foreign text immediately.

foreign_include foreign-code

The foreign code is added at the beginning of the generated parser.

foreign_text language num-outputs num-inputs variable-names foreign-text

Evaluate the foreign text. The input variables are guaranteed to be bound; the output variables may or may not be bound. The implementation is expected to provide a language specific interface to extract input values from the heap, unify output variables to the heap, and to signal success or failure. cow cannot verify that interface, but perhaps the implementation of the foreign_text can. The implementation is expected to signal success to the WAM in the same manner as a get_value command.
The language is either '-' or small marks. It is passed through to the foreign_text command without interpretation. All foreign texts generated by the symbol table have the language '%symbol':

Symbol table predicates fall into two categories: those created by symbol table mechanisms for use in the grammar, and those partially written by the grammar writer for use in the symbol table mechanism. Those created by symbol table are added to the grammar as foreign text rules with the language '%symbol'. This code intercepts these foreign text definitions from the foreign_text and splits them into individual instructions.

foreign_text %symbol 3 1 -SYMTAB -DEF -OBJECT +NAME SYMTABsymbol_table_new_definition
foreign_text %symbol 1 1 -OBJECT +NAME SYMTABsymbol_table_constant
foreign_text %symbol 0 1 +NAME SYMTABsymbol_table_not_constant
foreign_text %symbol 2 1 -OBJECT -SYMTAB +NAME SYMTABsymbol_table_identify
foreign_text %symbol 1 0 -SYMTAB SYMTABsymbol_table_new_table
foreign_text %symbol 3 0 -SYMTAB -SYMTAB0 -DEFS SYMTABsymbol_table_new_scope
foreign_text %symbol 3 0 -DEFS1 -DEFS2 -DEFS SYMTABsymbol_table_split_defs
foreign_text %symbol 1 0 -DEFS SYMTABsymbol_table_empty_defs
foreign_text %symbol 3 0 -SYMTAB -SYMTAB0 -SCOPE SYMTABsymbol_table_bottom_scope

Other languages are a matter between the foreign text implementation and the grammar writer.
num-outputs ::= integer
Number of output variables, m. The ref cells or bound values are stored in A.1 through A.m before evaluating the foreign text.
num-inputs ::= integer
Number of input variables, n. Bound values are stored in A.m+1 through A.m+n before evaluating the foreign text.
variable-names ::= {metanotion...}
m+n variable names; they are not checked for replication. These names are available to the foreign text implementation.
foreign-text ::= language-specific-text
The language specific code to be evaluated.

get_constant C.o.c A.i

Special case of get_structure against a constant.

get_list X.i

Special case of get_structure against a list.

get_structure C.o.f/n X.i

Unify a register against a structure, initialising the traversal pointer S.

get_value X.n A.i

get_value Y.n A.i

Unify the argument register and the other register.

get_variable X.n A.i

get_variable Y.n A.i

Move A.i to X.n or Y.n.

goto_state state sequential

Continue execution at the indicated state in the same domain. If sequential is true, the next state is only entered from the current state; the implementation may fold the next state into the current one and leave no label to it.

ibbuffer

Remove the first symbol from IQ and append it to IB.

ibclear

Completely clear the input buffer IB.

ibdiscard

Discard the first queued symbol from the input buffer (if not empty) or queue.

if_bound A.i

Succeeds if the register value is bound.

implicit_end {semantics}

The implementation should act as if an %end has been recognised; the semantics are evaluated.

increment_integer variable-name

Add one (1) to the variable's value.

initialise_constant C.o string arity

Specify the constant string for tagged offset C.o.

initialise_memory zone.o tag.value

Specify the initial value of a memory cell. Any zone can be initialised to any tagged value.

initialise_register register tag.value

Specify the initial value of a WAM register. A register can be initialised to any tagged value.

input_styles style [ |style ] ... style-sheet

A list of all styles referenced in the grammar.

iqreadahead automata n

Read enough symbols to make sure IQ has at least n symbols.
automata ::= %character|automata-name
For lex class parsers, the automata is %character; for cf class parsers, the automata is the name of another parser, a lexical recogniser.

iq_maximum maximum-lookahead

The maximum input queue length.

name_class symbol nameclass

The symbol belongs to the indicated class.

nop

No operation: do nothing.

on_failure X.i X.j

If the register X.i value is bound, this produces an error message using the bound value and passes; if the register value is unbound, this fails. X.j is the implementation defined source file location.

on_success X.i

If the register X.i value is bound, this is a predicate which has succeeded for a production.

parser class name {parser}

class ::= lex|cf

parse_domain domain if-start {parser-state...}

A collection of states.
States are partitioned into domains, collections of states such that only one state is called. Other states in the domain are only targets of gotos and do not need to use the return stack. The initial state of a domain is the same as the domain name.
if-start ::= boolean
If this is also the start state of the parser.

parse_error productions expected-symbols

A parse error occurred within these productions while expecting one of these symbols. The implementation is expected to add its current location to the error message.

parse_start state

The initial state of the parser.

parse_state domain state {transition...}

A collection of transition tests and edge semantics for the parser state.

partial_shift

If only total parse trees are used, the reduces and shifts will insert parse actions at the correct time, and nothing further needs to be done. However, with partial parse trees, a parse action may be inserted in one place and then deferred to another; the partial tree is necessary to keep track of when to activate these deferred actions.
If every single shift and reduction had a parse action, maintaining the partial parse tree could be made ancillary to those actions. However this is not necessarily the case, especially in a scanner; hence the use of this explicit command to maintain the partial parse tree shape.

pass

The machine has found a solution to the query.

perfect_hash_entry offset spelling symbol

Define entries in the hash table.
offset ::= integer
Which slot in the table the symbol occupies.

perfect_hash_modulus table-size

A perfect hash for the reserved words has been discovered.
table-size ::= integer
The table size. This will always be a power of two so that the modulus can be computed with a bitwise-and of m-1. The table may have up to m/2-1 empty entries.

perfect_hash_multiplier offset factor...

Given the multipliers $p_1, f_1, \dots, p_n, f_n$ and modulus $2^m$, and an input string $c_1 c_2 \cdots c_t$, the string character codes are $C[0] = k$, $C[i] = \text{unicode of } c_i$ for $1 \le i \le t$, and $C[i] = 0$ for $i > t$. The hash code is $h(c) = (f_1 C[p_1] + f_2 C[p_2] + \cdots + f_n C[p_n]) \bmod 2^m$.
offset ::= integer
factor ::= integer
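
The hash formula can be written out directly. The Python below is a sketch under assumptions: perfect_hash is a hypothetical name, the modulus is taken to be 2 to the power m, and because the value k used for C[0] is not defined in this section it is simply passed in as a parameter.

```python
def perfect_hash(spelling, multipliers, m, k=0):
    """h(c) = (f1*C[p1] + ... + fn*C[pn]) mod 2**m.

    multipliers is a list of (offset, factor) pairs (p, f).
    C[0] = k, C[i] is the code of the i-th character (1-based),
    and C[i] = 0 past the end of the spelling.
    """
    def code(i):
        if i == 0:
            return k
        if i <= len(spelling):
            return ord(spelling[i - 1])
        return 0

    total = sum(f * code(p) for p, f in multipliers)
    return total & ((1 << m) - 1)    # mod 2**m as a bitwise and
```

With the made-up multipliers (1,3), (2,5), (4,7) and m = 4, the spelling 'if' gives 3*105 + 5*102 + 7*0 = 825, and 825 mod 16 is 9.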

proceed

Continue execution at the address in the CP register.

psbegin

Push an unlabelled partial tree.

psclear

Clear the parse stack.

psend

The top partial parse tree is completed. Its reduction semantics are evaluated.

pspush nested-action n

Push a labelled partial tree with known size. The semantics are evaluated once the tree is completed.
nested-action ::= integer
Deferred action of the semantics to do on the reduce.
n ::= integer
Number of children.

ps_maximum maximum-depth

The maximum parse stack depth if known.

put_constant C.o.c A.i

Put a constant in the register.

put_list X.i

Put a new list cell address in the register.

put_structure C.o.f/n X.i

Put a new structure cell address in the register.

put_unsafe_value Y.n A.i

Move the register, ensuring it lives on the heap.

put_value X.n A.i

put_value Y.n A.i

Move the register X.n or Y.n to A.i.

put_variable X.n A.i

put_variable Y.n A.i

Initialise the registers to an unbound cell.

put_void A.i

Put an unbound cell in the register.

query_start L.query-address {variable X.i...}

Records where the main query starts and register assignments of its variables.

recognise_reservable symbol reserved-table-index name-class filters

Search the accepted input string contents against the reserved word table. If found there, that is the symbol label. Otherwise the label is symbol. Recognised symbols of different name-classes are distinct even if spelled the same. The filters are a nested set of optional filters: uppercase | lowercase | first,<n> | transform,<hn> | warn,uppercase | warn,lowercase | warn,first | warn,transform.

recognise_symbol symbol name-class filters

Label the accepted input string with symbol. Recognised symbols of different name-classes are distinct even if spelled the same. The filters are a nested set of optional filters: uppercase | lowercase | first,<n> | transform,<hn> | warn,uppercase | warn,lowercase | warn,first | warn,transform.

reduce_parse C.o.P/1 num-elems

Create a new parse tree node from the top num-elems elements of the (environment or parse) stack and push the new node onto the stack. The top elements are p1, p2, ⋯, pn and the new element is <P, [p1, p2, ⋯, pn, LOC] >. The WAM environment stack is available to hold these elements, or the implementation may provide some other stack. The LOC is the implementation defined source location. Note that each subtree also has a LOC field, any of which can be used for the new LOC.
o ::= integer
The parse tree node name offset.
num-elems ::= integer
Number of elements in the node.
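
The stack effect of reduce_parse can be sketched as follows. This is an illustrative Python model, not the generated code: the stack is a plain list, a node is a (name, children) tuple, and how LOC is chosen is left to the caller, since it is implementation defined.

```python
def reduce_parse(stack, name, num_elems, loc=None):
    """Replace the top num_elems stack elements with one node
    (name, [p1, ..., pn, LOC])."""
    if num_elems > len(stack):
        raise IndexError("parse stack underflow")
    cut = len(stack) - num_elems     # index of p1 on the stack
    children = stack[cut:]
    del stack[cut:]
    stack.append((name, children + [loc]))
    return stack
```

Reducing the top two elements of a stack holding a, b, c under the name P leaves a followed by the node P with children b, c and the location.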

report severity message

Report information from the parser generator to the implementation.
severity ::= C|E|I
Critical error, Error, or Information
message ::= any-text-as-one-argument

requires iq_multiple

The implementation requires an input queue of multiple symbols.

requires iq_scanner

The implementation requires a way to handle implicit %ends.

requires iq_single

The implementation requires an input queue of one symbol.

requires ib

The implementation requires an input buffer.

requires ps_fixed

The implementation requires a parse stack of fixed size with only complete structures.

requires ps_partial

The implementation requires a parse stack with partial structures.

requires ps_total

The implementation requires a parse stack of indefinite size with only complete structures.

reserved_word_table table-index {symbol-definitions}

Define a reserved word table.

retry L.o R.r

Try the next remote alternative at L.o. If the alternative fails, continue at R.r.

retry_me_else R.o

The next, nonfinal, alternative of a procedure. R.o is the address of the next alternative, a retry_me_else or a trust_me.

section L.o {script}

A labelled section of code.

set_constant C.o.c

Put a constant in the structure.

set_local_value Y.n

Put the register in the heap, ensuring it lives on the heap.

set_value X.n

set_value Y.n

Put the register in the heap.

set_variable X.n

set_variable Y.n

Put an unbound cell in the heap and the register.

set_void num-cells

Put unbound cells in the structure.
num-cells ::= integer
Number of cells to allocate.

shift_lexeme C.o.label/1 attributed

Shift the input lexeme. If the input buffer is not empty, shift from that; otherwise shift from the input queue. The input buffer may be declared as always empty. attributed is true if the symbol has an attribute.
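The choice of source for the shift can be sketched as below. Everything here is an assumption made for illustration: lexemes are (text, attribute) pairs, the input buffer and input queue are deques, and the shifted entry is pushed onto a list used as the parse stack.

```python
from collections import deque


def shift_lexeme(input_buffer, input_queue, parse_stack, label, attributed):
    """Shift one lexeme, preferring the input buffer over the input queue."""
    # A buffer declared as always empty simply never takes the first branch.
    source = input_buffer if input_buffer else input_queue
    if not source:
        raise IndexError("no input lexeme to shift")
    text, attribute = source.popleft()
    # Keep the attribute only when the symbol is declared as attributed.
    entry = (label, text, attribute if attributed else None)
    parse_stack.append(entry)
    return entry
```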

state_semantics {semantics...}

Semantics evaluated on state entry.

subquery_start queryname L.query-address {variable X.i...}

Records where a subquery starts and register assignments of its variables.

switch_on_constant {C.o.c L.code...}

Switch to the label if A.1 matches one of the constants.

switch_on_structure {C.o.f/n L.code...}

Switch to the label if A.1 matches one of the structures.

switch_on_term L.var L.const L.list L.structure

Switch to the corresponding label according to whether A.1 is a variable, constant, list, or structure.
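A minimal sketch of this four-way dispatch follows. The term representation is an assumption chosen for the example only: an instance of the hypothetical class Var is an unbound variable, a Python list is a list, a tuple (functor, args) is a structure, and anything else is a constant.

```python
class Var:
    """Hypothetical stand-in for an unbound variable cell."""


def switch_on_term(a1, l_var, l_const, l_list, l_structure):
    """Return the label to continue at, chosen by the kind of term in A.1."""
    if isinstance(a1, Var):
        return l_var
    if isinstance(a1, list):
        return l_list
    if isinstance(a1, tuple):
        return l_structure
    # Atoms, integers and other atomic values count as constants here.
    return l_const
```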

symbol_enum symbol-kind symbol

Enumerate all terminal and then nonterminal symbols.
symbol-kind ::= terminal|production

symbol_reserved spelling symbol

Define a reserved word with its exact spelling and the symbol it represents.

symbol_table_bottom_scope SYMTAB A.1 A.2 A.3

Detach and reattach a SCOPE to the bottom of the symbol table. SYMTAB is the name of the grammar symbol table. A.1 is the new symbol table. A.2 is the old symbol table, and A.3 is the scope.

symbol_table_class immediate-class {} immediate-class immediate-class class-number { class-property... }

symbol_table_class grammar-class {variable...} <TERM,signature,<VAR,variable>...> C.offset.signature/n class-number { class-property... }

The definition of an object class within a symbol table.

symbol_table_constant SYMTAB A.1 A.2

If the name is a constant object. SYMTAB is the name of the grammar symbol table. A.1 is the object. A.2 is the name.

symbol_table_definition immediate|grammar symbol-table-name { object-class-definition... }

The definition of a symbol table.

symbol_table_empty_defs SYMTAB A.1

If the DEFS are empty. SYMTAB is the name of the grammar symbol table. A.1 is empty definitions.

symbol_table_identify SYMTAB class A.1 A.2 A.3

Identify an object by its name and class set. SYMTAB is the name of the grammar symbol table. A.1 is the object, A.2 is the symbol table, and A.3 is the name.

symbol_table_method method call {variable...} {variable...}

method ::= initialiser | replacessiblingOBJECT | joinssiblingOBJECT | replacesancestorOBJECT | joinsancestorOBJECT | matchesOBJECT | constantNAME

immediate-call ::= hypernotion

grammar-call ::= {L.query-offset {variable Xi...}}

The definition of an object class method, either a hypernotion for an immediate symbol table, or a subquery call for a grammar symbol table.

symbol_table_interference distinct|conflicting {class-number...}

Sets of distinct or conflicting object classes.

symbol_table_new_definition SYMTAB A.1 A.2 A.3 A.4

Add a new definition to the bottom symbol table scope. SYMTAB is the name of the grammar symbol table. A.1 is the SYMTAB, A.2 is the DEF, A.3 is the OBJECT, and A.4 is the NAME.

symbol_table_new_scope SYMTAB A.1 A.2 A.3

Add a new scope, holding the new definitions, to the bottom of the symbol table. SYMTAB is the name of the grammar symbol table. A.1 is the new SYMTAB. A.2 is the old symbol table. A.3 is the new definitions.

symbol_table_new_table SYMTAB A.1

Create a new symbol table. SYMTAB is the name of the grammar symbol table. A.1 is the SYMTAB.

symbol_table_not_constant SYMTAB A.1

If no object class can unify the name to a constant object. SYMTAB is the name of the grammar symbol table. A.1 is the name.

symbol_table_split_defs SYMTAB A.1 A.2 A.3

Partition the DEFS. SYMTAB is the name of the grammar symbol table. A.1 and A.2 together form a partition of A.3.

term_production functor arity+1

Indicate a term is also used as a production term. It has the same variables as a term_vars, but its arity is one more and the last variable will be a list.

term_vars functor/arity {var...}

Name the variables associated with a term.

transition_character symbol match style {{character...}...}

Translation of a *character terminal. This instruction will only appear within the sub-script of a transition_iqis instruction. It informs the implementor whether the symbol is a single match or an except string, and which characters are in it.
If the style is not a single dash '-', then the characters must all be in the specified style.
match ::= is|isnt|is-any
is means this is a single character which must match; isnt means this is a list of lists of except strings; is-any means match any single character of that style. For is-any the character will be a single dash '-'.

transition_integer integer-test variable-name eq-values {semantics...}

Evaluate the semantics if the variable passes the integer test.
integer-test ::= po|zn|eq
Whether the variable is positive (po), is zero or negative (zn), or equals one of the eq-values (eq).
eq-values ::= {integer...}
When integer-test is 'eq', the variable must equal one of these integers. This is always 0 for 'po' and 'zn' tests.
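The three integer tests can be sketched as a single predicate. The function name and argument order are hypothetical; only the test names po, zn and eq come from the grammar above.

```python
def integer_test_passes(integer_test, value, eq_values=(0,)):
    """Return True when value satisfies the named integer test."""
    if integer_test == "po":
        return value > 0           # positive
    if integer_test == "zn":
        return value <= 0          # zero or negative
    if integer_test == "eq":
        return value in eq_values  # equals one of the listed integers
    raise ValueError("unknown integer-test: " + integer_test)
```

An implementation would evaluate the attached semantics only when the predicate holds.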

transition_integer_switch variable-name {semantics...}

Collects a number of transition_integer eq instructions for optimisation as a switch.

transition_iqis offset iq-test classes class-symbols {semantics}

Evaluate the semantics if the input symbol is one of the given symbols.
offset ::= integer
Which symbol of IQ; the first is 0.
iq-test ::= is|any
Whether one symbol or a set of symbols is tested.
classes ::= [ integer|integer... ]
Classes the symbols belong to. This is empty for an 'is' match. If 'any', there will be a define_terminal_class with the same classes and symbols.
class-symbols ::= symbol [ | symbol ... ]
Symbols in the classes.
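The input queue test can be sketched as below, assuming the input queue is an indexable Python sequence of symbol names and that a queue shorter than offset+1 counts as a failed match. Both assumptions, and the function name, are illustrative only.

```python
def iqis_matches(input_queue, offset, class_symbols):
    """Return True if the symbol at IQ[offset] is one of class_symbols."""
    # Offsets count from 0, so offset 0 is the first queued symbol.
    if offset < 0 or offset >= len(input_queue):
        return False
    return input_queue[offset] in set(class_symbols)
```

For an 'is' test class_symbols holds one symbol; for 'any' it holds every symbol of the listed classes.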

transition_jump {semantics}

Unconditionally evaluate the semantics.

translator_prelude

All symbols, classes, and other global information have been presented.

trim_environment num-perm

The environment stack frame can be trimmed.
num-perm ::= integer
Number of permanent variables left in the frame.

trust L.o

Try the last remote alternative at L.o.

trust_me

Begin the last alternative of a procedure.

try L.o num-args R.r

Allocate a choicepoint and try the first remote alternative at L.o. If the alternative fails, continue at R.r.
num-args ::= integer
Number of arguments to the procedure.

try_me_else R.o num-args

Begin the first alternative of a procedure. The choicepoint is allocated and initialised. R.o is the address of the next alternative, a retry_me_else or a trust_me.
num-args ::= integer
Number of arguments in the procedure.

unify_constant C.o.c X.i

Special case of unify_value against a constant.

unify_local_value X.i

Unify or write the register to a heap cell against the structure pointer S.

unify_value X.i

unify_value Y.i

Unify or write the register against the structure pointer S.

unify_variable X.i

unify_variable Y.i

Write or create a ref cell in the register against the structure pointer S.

unify_void num-variables

Unify variables with void names.
num-variables ::= integer
Number of void variables to unify.

wam_initialisation {initialisations}

Initialise the attribute machine.