.hnn
.pl 60
.bl 20
.ce 21
THE ML/I MACRO PROCESSOR


A Simple Introductory Guide___________________________



P.J. Brown
Computing Laboratory
University of Kent
Canterbury
Kent
England




6th impression
.da

Copyright 1970 - P.J. Brown
.hny
.bp1
.nf
.he "          A Simple Introductory Guide to the ML/I Macroprocessor"
Introduction____________
.fi
.np
ML/I is a general-purpose macroprocessor that is available on several different
computers. It has been used in many different applications; probably the most
common are the extension of existing programming languages and systematic
editing.
.np
There already exists a reference manual for ML/I ("ML/I User's Manual"
4th edition, obtainable from Computing Laboratory, University of Kent) and
a paper describing ML/I ("The ML/I macro processor", Comm ACM 10,10 [Oct. 1967]).
The former is, however, comparatively large and hence rather forbidding
while the latter tries to show off some of the more advanced features of 
ML/I. Hence many people have requested a short and simple introduction to ML/I
and this guide tries to satisfy the need. Obviously it has been necessary 
to omit many features and over-simplify others, but nevertheless it is still
hoped that the reader will get a feel of what ML/I is all about, and an idea
of some of its uses.
.np
In order to show exactly what goes into ML/I and, as a result, what comes out, 
examples in this guide have been written as if the reader were using ML/I
at an on-line console where, as each line of input is typed, the resultant 
output is typed back. The examples in fact represent a sequence of lines
of input to ML/I starting from scratch. The input lines are numbered
sequentially to aid cross-referencing. Lines of output are labelled as such to
differentiate them from input lines, e.g.
.nf

	23) THIS IS A LINE OF INPUT
	op) THIS IS THE CORRESPONDING OUTPUT
.fi
.np
(Before actually using ML/I the user should find out the way it is used at 
his installation, and learn whether there are any omissions or changes to the
facilities described here.)
.nf

Basic Principles________________
.fi
.np
ML/I operates on characters, not on numbers. It is fed as input a string
of characters called the source text. There is no restriction on the form
of the source text; it may, for example, be a program in any programming
language, a scientific paper, a circular letter or some data for a program.
What ML/I does to the source text is to scan through it making systematic
replacements; the resultant text is then output. ML/I produces its output
as it goes along. Its usual sequence of operation is: read a line - perform
the necessary replacements - output the line. In some uses of ML/I 
the output is subsequently fed to some compiler or assembler.
.np
The way of making a replacement is by means of a macro. A macro is defined
by writing:
.nf

	MCDEF X AS <Y>

.fi
where X describes
what is to be replaced and Y, which can be any string of characters,
is what it is to be replaced by. Y is called the replacement text.
.np
We will now start our imaginary session with conversational ML/I
and will illustrate some simple macros.
The first two lines of input to ML/I are normally used to define some
special symbols. We will supply these without explanation
now and will refer back to them later.
.nf

	01) MCINS %.
	02) MCSKIP MT,<>

We will now define a macro:

	03) MCDEF JONES AS <SMITH>

.fi
.np
The above definition causes every subsequent occurrence of JONES to be
replaced by SMITH, for example
.nf

	04) USERS SHOULD CONSULT MR. JONES
	op) USERS SHOULD CONSULT MR. SMITH
	05) .... JONES .... JONES ....
	op) .... SMITH .... SMITH ....

.fi
.np
Similarly macros can be defined to replace punctuation characters, for example
.nf

	06) MCDEF & AS <;>
	07) X:=Y & Y:=Z&
	op) X:=Y ; Y:=Z;

.fi
.np
Every time a macro is recognised and replaced by its replacement text this
is termed a call of the macro.
.np
The normal way of using ML/I is first to feed it with the definitions
of the macros (and other similar "constructions" - see later) and then to feed
it the source text in which the replacements are to be made. However, it 
is possible to intersperse macro definitions with the rest of the source text,
as indeed is being done in this guide.
.nf

Atoms_____
.fi
.np
To show how ML/I splits up its source text, consider the following
.nf

	08) MCDEF READ AS <INPUT TO THE COMPUTER>
	09) THEN READ YOUR DATA, MR. JONES
	op) THEN INPUT TO THE COMPUTER YOUR DATA, MR. SMITH
	10) THE DREADED READER SHOULD READ
	op) THE DREADED READER SHOULD INPUT TO THE COMPUTER

.fi
.np
Note that, in the line above, the sequence READ occurs within DREADED and
within READER but in neither case has it been replaced. This is because
ML/I does not, in fact, scan input character by character, but rather atom
by atom, where an atom is a single punctuation character (i.e. any
character other than a letter or digit) or a sequence of letters and/or
digits bounded on each side by punctuation characters. Thus in the above
line DREADED is an atom and hence ML/I does not recognise the letters "READ"
within it.
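.np
The splitting rule can be mimicked outside ML/I. What follows is a minimal
Python sketch, not part of ML/I itself: the names atoms and replace_atoms are
invented for illustration, and the sketch assumes single-atom macros with no
arguments, ignoring skips and the rescanning of replacement text.
.nf

```python
import re

# An atom is a run of letters and/or digits, or one other character.
# (Assumes ASCII letters and digits, as in the examples of this guide.)
ATOM = re.compile(r"[A-Za-z0-9]+|[^A-Za-z0-9]")

def atoms(text):
    """Split text into atoms; spaces and newlines are atoms too."""
    return ATOM.findall(text)

def replace_atoms(text, macros):
    """Replace every atom that is a macro name by its replacement text."""
    return "".join(macros.get(atom, atom) for atom in atoms(text))

macros = {"READ": "INPUT TO THE COMPUTER"}
print(atoms("MR. JONES"))
print(replace_atoms("THE DREADED READER SHOULD READ", macros))
```
.fi
.np
Because the comparison is made atom by atom, DREADED and READER never
match the macro name READ in this sketch.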
.np
The character "space" is treated as an atom and so is the character "newline",
the latter being an imaginary character that occurs at the end of each line.
As regards this newline character, users should be careful only to use it in 
replacement text exactly where they want it, for example
.nf

	11) MCDEF INTEGER AS <FIXED (15)
	12) BINARY>
	13) BEGIN INTEGER A,B;
	op) BEGIN FIXED (15)
	op) BINARY A,B;

.fi
Note how a newline appears in the output after FIXED (15).
.nf

Multi-Atom Names________________
.fi
.np
The macros defined so far have replaced a single atom. However, it is possible,
by using the keyword WITHS, to specify that a multi-atom sequence is to
be replaced. For example
.nf

	14) MCDEF THE WITHS PRIME WITHS MINISTER
	15) AS <MR. WILSON>
	16) THE PRIME MINISTER CHAIRS THE CABINET
	op) MR. WILSON CHAIRS THE CABINET
.bp
Calls within calls__________________
.fi
.np
Macro replacement takes place within the replacement text of other macros
as well as within the source text. (In fact recursion, i.e. a macro calling
itself, is allowed). For example:
.nf

	17) MCDEF THE WITHS CHAIRMAN
	18) AS <THE PRIME MINISTER OR HIS DEPUTY>
	19) THE CHAIRMAN SPEAKS FIRST
	op) MR. WILSON OR HIS DEPUTY SPEAKS FIRST

.fi
.np
Here, within the replacement text of THE CHAIRMAN, THE PRIME MINISTER
himself has been replaced.
.nf

Literal Brackets________________
.fi
.np
In all the examples of MCDEF the replacement text has been enclosed within
the characters "<" and ">". These are called literal brackets.
They mean "copy the enclosed text as it stands". Every user chooses
as his literal brackets a pair of atoms or sequences of atoms which
do not occur naturally in the text to be processed. In this guide
the literal brackets were specified in line 02 above. If it was
required to use "*(" and ")*" as literal brackets this would be done thus
.nf

	20) MCSKIP MT,* WITHS ( ) WITHS *

.fi
.np
With this definition (which has, in fact, supplemented rather than replaced
the previous literal brackets) we can now write
.nf

	21) MCDEF CUR AS *(DOG)*
	22) CUR
	op) DOG

.fi
Literal brackets have a secondary use, as illustrated by
.nf

	23) MCDEF REWIND AS <PRINT "FINISHED WITH TAPE"
	24) <REWIND>>
	25) REWIND
	op) PRINT "FINISHED WITH TAPE"
	op) REWIND

.fi
.np
The occurrence of REWIND within the replacement text of the REWIND macro
is enclosed in literal brackets (which are additional to those that enclose
the entire replacement text) to cause it to be copied literally over to
the output; if these literal brackets had been omitted it would have
been taken as a recursive call of the REWIND macro and ML/I would
have been set in an endless loop of replacing REWIND at successively
deeper levels.
.nf

Arguments_________
.fi
.np
The macros that have been defined so far have been of a rather simple
kind in that the thing to be replaced was always a single pre-defined
atom or series of atoms. Now we will consider a more complicated case.
Assume we wish to replace
.nf

	UNSTACK X;

for any X, by

	X:= STACK[POINTER];
	POINTER:=POINTER-1;
.fi
.np
Here the macro has an argument, i.e. an arbitrary string between
two predefined delimiters (in this case UNSTACK and semicolon).
Furthermore it is possible to have more than one argument, for instance
one might want to replace
.nf

	APPEND X TO LIST Y.

by

	Y[0]:=Y[0]+1;
	Y[Y[0]]:=X;
.fi
.np
This macro has three delimiters (APPEND, TO LIST and full-stop).
Delimiters are numbered 0,1,2 etc. Delimiter zero (in this
case APPEND) is called the macro name and the last delimiter is
called the closing delimiter. (In the case of a macro with no
arguments the macro name is also its closing delimiter).
.np
When a macro is defined, the delimiters are simply listed
in the order in which they are to occur; they may be
separated by one or more spaces or newlines. This is called a
structure representation. Thus the structure representation of the
APPEND macro is given after MCDEF in the definition
.nf

	26) MCDEF APPEND TO WITHS LIST .
.fi
.np
Before specifying the replacement text of this macro
we will give some further verbal explanation.
.np
When ML/I is given a definition of a macro that has arguments,
each subsequent occurrence of the macro name in the source text
is taken as a call of the macro and ML/I then searches for the first
delimiter, then the second, and so on until it has found
the closing delimiter.
The arbitrary string occurring between delimiter i and delimiter i+1 is
called argument i+1. Thus the first argument is argument one.
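.np
The numbering of arguments can be illustrated with a short Python sketch.
This is only a model under stated assumptions, not ML/I's actual scanning
algorithm: the function find_arguments and the representation of each delimiter
as a list of atoms are invented here, nested calls and skips are ignored,
and spaces around an argument are dropped to match the outputs shown in
this guide.
.nf

```python
import re

ATOM = re.compile(r"[A-Za-z0-9]+|[^A-Za-z0-9]")

def find_arguments(text, delimiters):
    """Return the arguments of the first call found in text.

    delimiters[0] is the macro name; each delimiter is a list of atoms
    (a multi-atom delimiter such as TO LIST is ["TO", "LIST"]).
    Spaces between the atoms of one delimiter are skipped, as with WITHS.
    """
    toks = ATOM.findall(text)

    def match_at(pos, delim):
        # Try to match the atoms of delim starting at toks[pos].
        for k, atom in enumerate(delim):
            if k > 0:
                while pos < len(toks) and toks[pos] == " ":
                    pos += 1
            if pos >= len(toks) or toks[pos] != atom:
                return None
            pos += 1
        return pos  # index just past the delimiter

    pos, args, start = 0, [], None
    for i, delim in enumerate(delimiters):
        while pos < len(toks):
            end = match_at(pos, delim)
            if end is not None:
                break
            pos += 1
        else:
            raise ValueError("delimiter %d not found" % i)
        if i > 0:
            # Text between delimiter i-1 and delimiter i is argument i.
            args.append("".join(toks[start:pos]).strip())
        pos = start = end
    return args

append = [["APPEND"], ["TO", "LIST"], ["."]]
print(find_arguments("APPEND PATIENT TO LIST WAIT.", append))
```
.fi
.np
With three delimiters the sketch returns two arguments, PATIENT as argument
one and WAIT as argument two.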
.np
It is usually necessary to insert arguments into the
replacement text of a macro and this is done by writing "%Ai."
where i is the number of the argument to be inserted. Hence,
continuing the definition of the APPEND macro
.nf

	27) AS<%A2.[0]:=%A2.[0]+1;
	28) %A2.[%A2.[0]]:=%A1.;>
	29) APPEND PATIENT TO LIST WAIT.
	op) WAIT[0]:=WAIT[0]+1;
	op) WAIT[WAIT[0]]:=PATIENT;
	30) APPEND X/Y+9 TO LIST ARRAY.
	op) ARRAY[0]:=ARRAY[0]+1;
	op) ARRAY[ARRAY[0]]:=X/Y+9;
.fi
.np
As a second example, the UNSTACK macro can be defined and used thus
.nf

	31) MCDEF UNSTACK;
	32) AS <%A1.:=STACK[POINTER];
	33) POINTER:=POINTER-1;>
	34) L:UNSTACK OP;
	op) L:OP:=STACK[POINTER];
	op) POINTER:=POINTER-1;
.fi
.np
It is often convenient to use the imaginary character newline as a 
delimiter, particularly as a closing one. Newlines themselves
are ignored in structure representations (hence the fact that
AS in the previous definitions started on a new line was not
significant) so when it is required to specify newline as a delimiter
it is necessary to use a keyword, namely NL. (Similarly most implementations
of ML/I have other keywords such as SPACE, SPACES and perhaps TAB). For
example
.nf

	35) MCDEF CALL NL
	36) AS <   JMS   %A1.
	37) >
	38) CALL PIG
	op)    JMS   PIG
	39) UNSTACK Y; LAB: CALL PIGGY
	op) Y:=STACK[POINTER];
	op) POINTER:=POINTER-1; LAB:   JMS   PIGGY
.fi
.np
Note that keywords apply only within structure representations. Elsewhere
NL stands for itself.
.nf


Optional delimiters___________________
.fi
.np
We will now go one step further in the elaboration of macros
and we will introduce one of the most important concepts in ML/I,
namely optional delimiters.
.np
Assume one wished to define a macro which had two alternative forms
.nf

	SET A = B + C

which should be replaced by

	LD	B
	ADD	C
	ST	A

and

	SET A = B - C

which should be replaced by

	LD	B
	SUB	C
	ST	A

.fi
i.e. delimiter two can be either a plus sign or a minus sign. Options
such as this are specified in structure representations by writing
.nf

	OPT branch 1 OR branch 2 OR .... OR branch n ALL

.fi
where each branch can itself be any structure representation.
In practice a branch is usually a single delimiter. Hence the
structure representation of the SET macro is written
.nf

	40) MCDEF SET = OPT + OR - ALL NL
.fi
.np
Within the replacement text of this macro it is necessary to
test whether delimiter two was a plus or a minus sign
and generate code accordingly. This is done thus
.nf
.bp
	41) AS <	LD	%A2.
	42) MCGO L1 IF %D2. = +
	43) 		SUB     %A3.
	44) MCGO L2
	45) %L1.        ADD     %A3.
	46) %L2.        ST     	%A1.
	47) >
.fi
.np
Two new features have been introduced here. The first is macro-time
statements, i.e. statements that are executed by ML/I when it
encounters them at the time it is macro processing.
The above example shows one such macro-time statement, the "go to"
statement, MCGO. As can be seen, MCGO has an optional
conditional clause.
.np
A second new feature is the extended use of the "%"
notation to include the insertion of delimiters (e.g. %D2.
meaning delimiter two) and to place labels that are the destination
of the MCGO statements (e.g. %L1. and %L2. above).
.np
Unlike other types of insert the "inserting" of a label does not
generate any text, i.e. it has a "null" value.
.np
We will now call SET to show that it works
.nf

	48) SET X1 = Y1 + Z1
	op)     LD         Y1
	op)     ADD        Z1
	op)     ST         X1
	49) SET BROWN = JONES - ROBINSON
	op)     LD         SMITH
	op)     SUB        ROBINSON
	op)     ST         BROWN
.fi
.np
Note how the JONES macro, defined back in line 03, is still
in existence. Its use above shows how one macro can be called
within an argument to another.
.nf

Variable numbers of arguments_____________________________
.fi
.np
We will now take the last step in the elaboration of macros
and will describe how to use macros that have a variable
number of arguments.
.np
Assume, therefore, that it is required to define a
macro called LET which is similar to the SET macro except that
it can have to the right of the equals sign an arbitrary
expression involving additions and/or subtractions. Thus
typical calls of LET might be
.nf
.bp
	LET A = B + C - D + F - G
	LET X = Y
	LET Y = X + Y + C + D
.fi
.np
When scanning a call of this macro ML/I should first encounter
LET and an equals sign. It should then search for
plus, minus or a newline. If it found either of the first two
it should recycle, i.e. again look for a plus, minus or a newline.
If it found a newline then that is the closing delimiter and thus the
replacement text could be examined.
.np
This scheme for searching for delimiters is specified
in structure representations by the use of nodes. (The word
"node" is used since the delimiter structure of a macro can
conveniently be represented as a directed graph). A node is 
"placed" at a given point in a structure representation and can
be "gone to" from the end of any branch in the same structure
representation. Nodes, which are local to the structure representation
in which they occur, are written N1, N2, N3 etc. A node is placed
just by writing its name before a delimeter specification
and is gone to simply by placing its name at the end of a branch -
note that there is no explicit "go to". Hence the structure representation
of the LET macro is written
.nf

	50) MCDEF LET = N1 OPT + N1 OR - N1 OR NL ALL
.fi
.np
Here the node is called N1 and is placed before the alternatives
plus, minus or newline. If either the plus branch or the minus
branch is taken, the branch ends by going back to N1. If, on
the other hand, the newline branch is taken the next delimiter
is taken as the one following the ALL. In this case nothing follows
the ALL so newline is the closing delimiter.
.np
We will now consider the problems that arise in specifying the
replacement text for the LET macro.
The main problem is that one does not know in advance how many arguments
there will be. The way to deal with this is to write a macro-time loop
that takes the arguments one by one until they have run out.
To do this one needs variables for counting and subscripting
and ML/I caters for this need by supplying three integer variables
T1, T2, and T3 which are local to each macro call.
(There also exist permanent global
variables P1, P2, P3, etc., but these are not of immediate interest
here).
These variables are called macro variables and ML/I contains an
assignment, MCSET, for manipulating them; MCGO can be used for
testing them. Macro variables or expressions involving them
can be used as subscripts, e.g. if T1 had value 3 then %AT1. would mean
insert argument three and %DT1-1. would mean insert delimiter two;
also, values of macro variables can be inserted as
they stand, e.g. %T1. would generate the character "3".
.np
Using these facilities the replacement text of the LET macro
is written
.nf

	51) AS<   LD   %A2.
	52) MCSET T1 = 3
	53) %L4.MCGO L2 IF %DT1-1. = +
	54) MCGO L5 UNLESS %DT1-1. = -
	55)    SUB  %AT1.
	56) MCGO L3
	57) %L2.   ADD  %AT1.
	58) %L3.MCSET T1 = T1 + 1
	59) MCGO L4
	60) %L5.   ST   %A1.
	61) >

and a sample call is

	62) LET A = B-C+D
	op)    LD   B
	op)    SUB  C
	op)    ADD  D
	op)    ST   A
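.fi
.np
For readers more used to conventional languages, the macro-time loop in
lines 51 to 61 can be paraphrased in Python. This is a hedged sketch only:
the function expand_let and the lists args and delims are invented for
illustration and stand in for the %Ai. and %Di. inserts; ML/I itself does
not work on Python lists.
.nf

```python
def expand_let(args, delims):
    """Mimic the replacement text of LET for one call.

    args[i] and delims[i] stand for %Ai. and %Di.; index 0 of args is
    unused so that the numbering matches the guide.
    """
    out = ["   LD   " + args[2]]
    t1 = 3                                  # MCSET T1 = 3
    while True:
        d = delims[t1 - 1]                  # %DT1-1.
        if d == "+":
            out.append("   ADD  " + args[t1])
        elif d == "-":
            out.append("   SUB  " + args[t1])
        else:                               # newline: closing delimiter
            break
        t1 += 1                             # MCSET T1 = T1 + 1
    out.append("   ST   " + args[1])
    return out

# The call LET A = B-C+D, with its delimiters listed in order.
lines = expand_let([None, "A", "B", "C", "D"], ["LET", "=", "-", "+", "\n"])
print("\n".join(lines))
```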


Specialised example___________________
.fi
.np
As a last example we will show a macro which illustrates no new concepts
but which may be of interest to readers familiar with Polish notation.
The macro converts from fully parenthesized algebraic notation to
Polish prefix notation.
.nf

	63) MCDEF ( OPT + OR - OR * OR / ALL )
	64) AS <%D1.%A1.%A2.>
	65) (A+B)
	op) +AB
	66) ((A-(B*C))/(X/Y))
	op) /-A*BC/XY
.fi
.np
In this example the macro name is a left parenthesis and its closing
delimeter is a right parenthesis. Line 66 above shows a nested and, in fact, 
recursive call of this macro.
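.np
The effect of the recursive call can be checked against an ordinary
recursive routine. The Python sketch below is an illustration rather than a
description of how ML/I evaluates the macro: the function to_prefix is an
invented name, and it assumes well-formed input with no spaces, exactly two
operands per pair of parentheses and operands free of operator characters.
.nf

```python
def to_prefix(expr):
    """Convert a fully parenthesised expression to Polish prefix form."""
    pos = 0

    def operand():
        nonlocal pos
        if expr[pos] != "(":
            # A plain operand: read up to the next operator or parenthesis.
            start = pos
            while pos < len(expr) and expr[pos] not in "+-*/()":
                pos += 1
            return expr[start:pos]
        pos += 1                      # macro name "("
        left = operand()              # argument one
        op = expr[pos]                # delimiter one
        pos += 1
        right = operand()             # argument two
        pos += 1                      # closing delimiter ")"
        return op + left + right      # %D1.%A1.%A2.

    return operand()

print(to_prefix("((A-(B*C))/(X/Y))"))
```
.fi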
.nf
.bp
Skips and inserts_________________
.fi
.np
This ends the description of macros as such and we will end this guide
by clearing up a few isolated features not already covered.
Firstly we will consider inserts and skips.
.np
"Insert" is the name for the "%" facility. All that remains to be said about
this is that, in a similar manner to literal brackets, the user chooses as
his insert marker (i.e. what we have used percent for) an atom or sequence
of atoms that do not occur naturally in the text to be processed. The
insert marker is normally defined at the start of the source text, as
in line 01 above. If it was desired to use "//" as an insert marker it would
have been defined
.nf

	67) MCINS / WITH / .
.fi
.np
(WITH is similar to WITHS but means that no spaces are allowed
between the atoms it connects). In this definition a full stop is used
as the closing delimiter of the insert (as with the % sign). The
user can vary this if he wishes.
.np
Skips have already been mentioned in that literal brackets are a special
case of a skip. A skip has, like a macro, an associated structure 
representation. When ML/I encounters the name of a skip it "switches off"
all the macros until it comes to the closing delimiter of the skip.
The only things that may be recognised within a skip are other skips
and these are only recognised if the first skip has the "M"
(for matched) option set. What ML/I does to skips
is controlled by two further options: the "D" option means copy the
delimiters to the output and the "T" option means copy the intervening
pieces of text (i.e. in macro terms, the arguments). Hence if,
for example, neither "D" nor "T" is set, the skip is totally deleted.
A skip is defined thus
.nf

	MCSKIP options, structure representation

For example

	68) MCSKIP DT, ' '

.fi
means define the quote sign as a skip name with another quote
as the closing delimiter, and set the "D" and "T" options (but
not the "M" option) for it. The following skip would, on the
other hand, be totally deleted since no options are set
.nf

	69) MCSKIP , COMMENT ;

The following examples show how these two skips work

	70) 'LET SET JONES'
	op) 'LET SET JONES'
	71) COMMENT LET IS A MACRO; 'COMMENT'
	op) 'COMMENT'


Searching for delimiters________________________
.fi
.np
If ML/I has found a macro name and is searching for
its delimiters it may encounter a nested macro call or skip.
In this case it "goes down a level" and searches for the delimiters
of the nested construction and only when the closing delimiter
of this is found does it return to the original search.
Thus if one wrote the nonsense line
.nf

	72) LET A <=> = UNSTACK L+1; + 7

.fi
the second equals and the second plus would be delimiters of the
LET macro since the first equals is within a skip and the
first plus is within a call of the UNSTACK macro. The nonsensical
output from the above has not been shown.
.nf

Operation macros________________
.fi
.np
All the built-in ML/I statements like MCDEF, MCSKIP, MCSET, MCGO,
etc., have the generic name of operation macro. Operation macros are analogous
to ordinary user-defined macros in the way they are scanned and in the way their
arguments are evaluated but are different in that they perform some predefined
system action instead of effecting a replacement. There are many operation
macros that have not been covered in this guide, including MCNOTE (which
prints a message together with the current line number), MCSUB (which
extracts a sub-string) and MCLENG (which finds the length of a string).
.nf

Replacement___________
.fi
.np
Note that macro replacement takes place everywhere
in the text scanned by ML/I except within skips. For example, it
takes place within arguments to operation macros (e.g. in
structure representations) and within inserts. In fact ML/I
contains virtually no restrictions on what one can do and
where one can do it. This in many ways contributes to the
power of ML/I but it does mean that the user is not restricted
as to the depths of the logical mires he can get himself into,
nor in the machine time he can use trying to make ML/I do
things it was not designed for.
.nf
.bp
Concluding remarks__________________
.fi
.np
We have now come to the end of this guide. Hopefully the reader has
reached a stage where, although his understanding of ML/I is of necessity 
rather patchy and, in some cases, superficial,
he can still use ML/I in some simple applications and, perhaps, in some
not-so-simple ones too. After having some experience
with ML/I he may wish to refer to the User's Manual
to fill in some of the gaps in his knowledge.
.np
Among the facilities that have not been mentioned are:
.nf

Warning Mode____________
.fi
.np
ML/I can run in a mode where every macro call needs to be preceded by
a special atom.
.nf

System Variables________________
.fi
.np
Built-in integer variables for controlling and monitoring some of the
facilities of ML/I.
.nf

Startlines__________
.fi
.np
Imaginary characters that can be inserted at the start of each line.
They can be defined as macro names and are useful in processing input
that is in units of lines.
.nf

Stop Markers____________
.fi
.np
An omitted delimiter can cause ML/I to search forever trying to find it.
A stop marker prevents this.
.nf

Exclusive Delimiters____________________
.fi
.np
After a macro call has been processed scanning may re-start at rather
than beyond its closing delimeter.
