This tape includes object code for c0, c1, and c2.
Older C compilers will not compile the source
because (1) mutually recursive structures didn't
use to be handled right, and (2) there are
some 'long' declarations for constant conversions.
In an emergency it shouldn't be too hard to repair
the source by hand to get it to compile.

For enthusiasts I include a description of certain aspects
of the compiler, which has changed a bit in structure.
Caution: do not believe every detail of the discussion
of the format of the intermediate language.  It has changed
slightly.  The format of the writeup is designed for nroff or troff
with the "-ms" macro package; if you don't have that,
you will have to muddle through.

The good news about this version of C is detailed below.
The bad news is that it is noticeably larger.
c1 starts off with 27400 bytes, and allocates more
for its expression trees.
The good side of the allocation strategy is that if there
is enough core then certain kinds of table overflow no longer occur.

Making the C compiler from scratch:

c0 is made by compiling c0?.c.  That's all you have
to do.  It has no assembly-language stuff and is free
of floating point.

c1 is made by running cvopt on table.s to yield
table.i, assembling table.i, and naming the a.out
table.o. (This is a combined version of all the older tables.)
Then compile c1[0-3].c together with table.o.

	cvopt table.s table.i
	as table.i
	mv a.out table.o
	cc -n -O c1*.c table.o

c2 is made just by cc c2?.c.
It is not functionally different from older c2's
except for a few bug fixes.

The c1 made above expects floating point hardware.
It can be loaded with the simulator for non-FP machines.
It is probably best just to do that if you don't have
core-size problems; the speed isn't affected
since FP instructions are executed only if FP
constructions appear.
There is no flag that turns off the use of FP; to do it, either
(1) catch illegal instructions by calling signal in
the main routine, directing the signal to an error-printer, or
(2) find the call to atof in c11.c (I think)
and put in a fatal error message.
(1) is best since there is an occasional FP instruction
that could get executed on some initializations.

I have included some library routines.

csv is a (compatible) new version of the save sequence
that is needed to make functions returning longs work.

lseek is a long version of the seek system call that does a 32-bit
seek in one call.

itol concatenates a pair of integers to produce a long.
It is needed for example to avoid sign extension
in assigning an int to a long.
ltoi is the opposite conversion.

longops is the long */% routine.
It uses the floating point processor.
ilongops has the same specs and uses fixed point
only; if you do not have FP, install it instead.
If you like, you can split it into several pieces.

printf has been pepped up to print longs via %ld.
Also, %lo and %lx are in for octal and hex.
(Just %x prints 16-bit hex).
The old %l is now written %u for unsigned,
but if %l isn't followed by d, o, or x it acts like it used to.
Also in are %D, %O, %X for the long versions;
they are possibly preferred synonyms for %l?.

The most notable new things in this compiler are:

long integer type: say
	long a;
 or	long int a;

which are the same;  "long float" is the same as "double".

Essentially all operations on longs are implemented except that
assignment-type operators do not have values, so
l1+(l2=+l3) won't work.
Neither will l1 = l2 = 0.
Long constants are written with a terminating 'l' or 'L'.
E.g. "123L" or "0177777777L" or "0X56789abcdL".
(The latter is a hex constant, which could also have been short;
it is marked by starting with "0X".)
Every fixed decimal constant larger than 32767 is taken to
be long, and so are octal or hex constants larger than
0177777 (0Xffff, or 0xFFFF if you like).
A warning is given in such a case since this is actually
an incompatibility with the older compiler.
Where the constant is just used as an initializer or
assigned to something it doesn't matter.
If it is passed to a subroutine
then the routine will not get what it expected.

Long multiply and divide call a subroutine.
A fast version which uses floating-point and a slow
version which doesn't are included.

This compiler properly handles initialization of structures
so you can say things like
	struct { char name[8]; char type; float val; } x
		{ "abc", 'a', 123.4 };

Structures of arrays, arrays of structures, and the like all work;
a more formal description of what is done follows.

<initializer> ::= <element>

<element> ::= <expression> | <element> , <element> |
                { <element> } | { <element> , }

An element is an expression or a comma-separated sequence of
elements possibly enclosed in braces.  In a brace-enclosed
sequence, a comma is optional after the last element.  This very
ambiguous definition is parsed as described below.  "Expression"
must of course be a constant expression within the previous
meaning of the Act.

An initializer for a non-structured scalar is an element with
exactly one expression in it.

An "aggregate" is a structure or an array.  If the initializer
for an aggregate begins with a left brace, then the succeeding
comma-separated sequence of elements initializes the members of
the aggregate.  It is erroneous for the number of elements in the
sequence to exceed the number of members in the aggregate.  If
the sequence has too few elements the aggregate is padded.

If the initializer for an aggregate does not begin with a left
brace, then the members of the aggregate are initialized with
successive elements from the succeeding comma-separated sequence.
If the sequence terminates before the aggregate is filled the
aggregate is padded.

The "top level" initializer is the object which initializes an
external object itself, as opposed to one of its members.  The
top level initializer for an aggregate must begin with a left
brace.

If the top-level object being initialized is an array and if its
size is omitted in the declaration, e.g. "int a[]", then the size
is calculated from the number of elements which initialized it.

Short of complete assimilation of this description, there are two
simple approaches to the initialization of complicated objects.
First, observe that it is always legal to initialize any object
with a comma-separated sequence of expressions.  The members of
every structure and array are stored in a specified order, so the
expressions which initialize these members may if desired be laid
out in a row to successively, and recursively, initialize the
members.

Alternatively, the sequences of expressions which initialize
arrays or structures may uniformly be enclosed in braces.

The compiler is somewhat stickier about
some constructions that used to be accepted.
One difference is that external declarations made inside
functions are remembered to the end of the file,
that is, even past the end of the function.
The most frequent problem that this causes is that
implicit declaration of a function as an integer in one
routine,
and subsequent declaration
of it as another type,
is not allowed.
This turned out to affect
several source programs
distributed with the system.

Another new thing is bit fields.
A declarator inside a structure may have the form

	<declarator> : <constant>

which specifies that the object declared is stored in a field
whose width in bits is given by the constant.
If several such things are stacked up next to each other
then the compiler allocates the fields from right to left,
going to the next word
when the new field will not fit.
The declarator may also have the form

	: <constant>

which allocates an unnamed field to simplify accurate
modelling of things like hardware formats where there are unused
fields.
Finally,

	: 0

means to force the next field to start on a word boundary.

The types of bit fields can be only "int" or "char".
The only difference between the two
is in the alignment and length restrictions:
no int field can be longer than 16 bits, nor any char longer
than 8 bits.
If a char field will not fit into the current character,
then it is moved up to the next character boundary.

Both int and char fields
are taken to be unsigned (non-negative)
integers.

Bit-field variables are not quite full-class citizens.
Although most operators can be applied to them,
including assignment operators,
they do not have addresses (i.e. there are no bit pointers)
so the unary & operator cannot be applied to them.
For essentially this reason there are no arrays of bit-field
variables.

There are two botches in the implementation:
a 16-bit integer field can test negative,
and addition (=+) applied to fields
can result in an overflow into the next field.
