-h- readme.txt	Thu Mar 14 13:50:42 1985	README.TXT;11

Decus CPP is a public-domain implementation of the C preprocessor.
It runs on VMS native (Vax C), VMS compatibility mode (Decus C),
RSX-11M, RSTS/E, P/OS, and RT11, as well as on several varieties
of Unix, including Ultrix.  These notes describe how to extract
the cpp source files and configure them for your needs, and mention
a few design decisions that may be of interest to maintainers.

			Installation

Because the primary development of cpp was not on Unix, it
is distributed using the Decus C archive program (quite similar
to the archiver published in Kernighan and Plauger's Software
Tools).  To extract the files from the net.sources distribution,
save this message as cpp1.arc (and the other two distribution
files as cpp2.arc and cpp3.arc).  Then, using your favorite editor,
locate the archx.c program, just following the line beginning with
"-h- archx.c" -- the format of the tape is just:

    -h- archx.c
      ... archx.c program
    -h- archc.c
      ... archc.c program

Compile archx.c -- it shouldn't require any special editing.
Then run it as follows:

    archx cpp1.arc
    archx cpp2.arc
    archx cpp3.arc

You do not need to remove mail headers from the saved messages.

You should then read through cppdef.h to make sure the HOST and
TARGET (and other implementation-specific) definitions are set
correctly for your machine, editing them as needed.

You may then copy makefile.txt to Makefile, editing it as needed
for your particular system.  On Unix, cpp should be compiled
by make without further difficulty.  On other operating systems,
you should compile the six source modules, linking them together.
Note that, on Decus C based systems, you must extend the default
stack allocation.  The Decus C build utility will create the
appropriate command file.

			Support Notes

The distribution kit was designed to keep all submissions around
50,000 bytes:

cpp1.arc:
	readme.txt	This file
	cpp.mem		Documentation page (see below)
	archx.c		Archive extraction program
	archc.c		Archive construction program
	cpp.rno		Source for cpp.mem (see below)
	makefile.txt	Unix makefile -- copy to Makefile
	cpp.h		Main header file (structure def's and globals)
	cppdef.h	Configuration file (host and target definitions)

cpp2.arc:
	cpp1.c		Mainline code, documentation master sources
	cpp2.c		most #control processing
	cpp3.c		filename stuff and command line parsing
cpp3.arc:
	cpp4.c		#define processor
	cpp5.c		#if <expr> processor
	cpp6.c		Support code (symbol table and I/O routines)
	
cpp intentionally does not rely on the presence of a full-scale
macro preprocessor; it does, however, require the simple parameter
substitution preprocessor capabilities of Unix V6 and Decus C.  If
your C compiler lacks full preprocessing, you should make sure
"nomacargs" is #define'd in cpp.h.  (This is done automatically by
the Decus C compiler.)

The documentation (manual page) for cpp is included as cpp.mem
and cpp.rno.  Cpp.rno is in Dec Runoff format, built by a Decus C
utility (getrno) from original source which is embedded in cpp1.c.
To my knowledge, there is no equivalent program that creates
the nroff source appropriate for Unix.

I would be happy to receive fixes to any problems you encounter.
As I do not maintain distribution kit base-levels, bare-bones
diff listings without sufficient context are not very useful.
It is unlikely that I can find time to help you with other
difficulties.

			Acknowledgements

I received a great deal of help from many people in debugging cpp.
Alan Feuer and Sam Kendall used "state of the art" run-time code
checkers to locate several errors.  Ed Keiser found problems when
cpp was used on machines with different int and pointer sizes.
Dave Conroy helped with the initial debugging.

Martin Minow
decvax!minow


From:	RHEA::DECWRL::"decvax!minow"    8-JAN-1985 00:38  
To:	decwrl!rhea!rex!minow
Subj:	new cpp:readme.txt

Received: from DECWRL by DEC-RHEA with SMTP; Mon,  7 Jan 85 21:39-PST
Received: by decwrl.ARPA (4.22.01/4.7.34)
	id AA07935; Mon, 7 Jan 85 21:38:18 pst
Received: by decvax.UUCP (4.12/1.0)
	id AA07664; Mon, 7 Jan 85 23:59:31 est
Date: Mon, 7 Jan 85 23:59:31 est
Return-Path: <decvax!minow>
Message-Id: <8501080459.AA07664@decvax.UUCP>



Decus cpp is a public-domain implementation of the C preprocessor.
It runs on VMS native (Vax C), VMS compatibility mode (Decus C),
RSX-11M, RSTS/E, P/OS, and RT11, as well as on several varieties
of Unix, including Ultrix.  Decus cpp attempts to implement features
in the Draft ANSI Standard for the C language.  It should be noted,
however, that this standard is under active development:  the current
draft of the standard explicitly states that "readers are requested
not to specify or claim conformance to this draft."  Thus readers
and users of Decus cpp should not assume that it conforms to the
draft standard, or that it will conform to the actual C language
standard.

These notes describe how to extract the cpp source files and configure them
for your needs, and mention a few design decisions that may be of interest
to maintainers.

			Installation

Because the primary development of cpp was not on Unix, it
is distributed using the Decus C archive program (quite similar
to the archiver published in Kernighan and Plauger's Software
Tools).  To extract the files from the net.sources distribution,
save this message as cpp1.arc and the other two distribution
files as cpp2.arc and cpp3.arc.  Then, using your favorite editor,
locate the archx.c program, just following the line beginning with
"-h- archx.c" -- the format of the distribution is just:

    -h- readme.txt
      ... this file
    -h- cpp.mem
      ... description of cpp
    -h- archx.c
      ... archx.c program -- extracts archives
    -h- archc.c
      ... archc.c program -- creates archives

Compile archx.c -- it shouldn't require any special editing.
Then run it as follows:

    archx *.arc

You do not need to remove mail headers from the saved messages.

You should then read through cppdef.h to make sure the HOST and
TARGET (and other implementation-specific) definitions are set
correctly for your machine, editing them as needed.

You may then copy makefile.txt to Makefile, editing it as needed
for your particular system.  On Unix, cpp should be compiled
by make without further difficulty.  On other operating systems,
you should compile the six source modules, linking them together.
Note that, on Decus C based systems, you must extend the default
stack allocation.  The Decus C build utility will create the
appropriate command file.

			Support Notes

The USENET distribution kit was designed to keep all submissions around
50,000 bytes:

cpp1.arc:
	readme.txt	This file
	cpp.mem		Documentation page (see below)
	archx.c		Archive extraction program
	archc.c		Archive construction program
	cpp.rno		Source for cpp.mem (see below)
	makefile.txt	Unix makefile -- copy to Makefile
	cpp.h		Main header file (structure def's and globals)
	cppdef.h	Configuration file (host and target definitions)

cpp2.arc:
	cpp1.c		Mainline code, documentation master sources
	cpp2.c		most #control processing
	cpp3.c		filename stuff and command line parsing
cpp3.arc:
	cpp4.c		#define processor
	cpp5.c		#if <expr> processor
	cpp6.c		Support code (symbol table and I/O routines)
	
Cpp intentionally does not rely on the presence of a full-scale
macro preprocessor; it does, however, require the simple parameter
substitution preprocessor capabilities of Unix V6 and Decus C.  If
your C compiler lacks full preprocessing, you should make sure
"nomacargs" is #define'd in cpp.h.  (This is done automatically by
the Decus C compiler.)

The documentation (manual page) for cpp is included as cpp.mem
and cpp.rno.  Cpp.rno is in Dec Runoff format, built by a Decus C
utility (getrno) from original source which is embedded in cpp1.c.
To my knowledge, there is no equivalent program that creates
the nroff source appropriate for Unix.

I would be happy to receive fixes to any problems you encounter.
As I do not maintain distribution kit base-levels, bare-bones
diff listings without sufficient context are not very useful.
It is unlikely that I can find time to help you with other
difficulties.

			Acknowledgements

I received a great deal of help from many people in debugging cpp.
Alan Feuer and Sam Kendall used "state of the art" run-time code
checkers to locate several errors.  Ed Keiser found problems when
cpp was used on machines with different int and pointer sizes.
Dave Conroy helped with the initial debugging, while Arthur Olsen
and George Rosenberg found (and solved) several problems in the
first USENET release.

Martin Minow
decvax!minow

-h- cpp.mem	Thu Mar 14 13:50:42 1985	CPP.MEM;143




        1.0  C Pre-Processor



                                    *******
                                    * cpp *
                                    *******



        NAME:   cpp -- C Pre-Processor

        SYNOPSIS:

                cpp [-options] [infile [outfile]]

        DESCRIPTION:

                CPP reads a C source file, expands  macros  and  include
                files,  and writes an input file for the C compiler.  If
                no file arguments are given, CPP reads  from  stdin  and
                writes  to  stdout.   If  one file argument is given, it
                will define the input file,  while  two  file  arguments
                define  both  input and output files.  The file name "-"
                is a synonym for stdin or stdout as appropriate.

                The following options are  supported.   Options  may  be
                given in either case.

                -C              If set, source-file comments are written
                                to  the  output  file.   This allows the
                                output of CPP to be used as the input to
                                a  program,  such  as lint, that expects
                                commands embedded in specially-formatted
                                comments.

                -Dname=value    Define the name  as  if  the  programmer
                                wrote

                                    #define name value

                                at the start  of  the  first  file.   If
                                "=value"  is  not  given, a value of "1"
                                will be used.

                                On non-Unix systems, all alphabetic text
                                will be forced to upper-case.

                -E              Always return "success" to the operating
                                system,  even  if  errors were detected.
                                Note that some fatal errors, such  as  a
                                missing  #include  file,  will terminate
                                CPP, returning "failure" even if the  -E
                                option is given.

                -Idirectory     Add  this  directory  to  the  list   of
                                directories  searched for #include "..."
                                and #include <...> commands.  Note  that
                                there  is  no space between the "-I" and
                                the directory string.  More than one  -I
                                command   is   permitted.   On  non-Unix
                                systems   "directory"   is   forced   to
                                upper-case.

                -N              CPP  normally  predefines  some  symbols
                                defining   the   target   computer   and
                                operating system.  If -N  is  specified,
                                no symbols will be predefined.  If -N -N
                                is  specified,  the   "always   present"
                                symbols,    __LINE__,    __FILE__,   and
                                __DATE__ are not defined.

                -Stext          CPP normally assumes that  the  size  of
                                the  target  computer's  basic  variable
                                types is the same as the size  of  these
                                types  of  the host computer.  (This can
                                be  overridden  when  CPP  is  compiled,
                                however.)  The  -S option allows dynamic
                                respecification of these values.  "text"
                                is  a  string  of  numbers, separated by
                                commas, that  specifies  correct  sizes.
                                The sizes must be specified in the exact
                                order:

                                    char short int long float double

                                If you specify the option as  "-S*text",
                                pointers   to   these   types   will  be
                                specified.   -S*  takes  one  additional
                                argument  for  pointer to function (e.g.
                                int (*)())

                                For   example,    to    specify    sizes
                                appropriate  for  a  PDP-11,  you  would
                                write:

                                       c s i l f d func
                                     -S1,2,2,2,4,8,
                                    -S*2,2,2,2,2,2,2

                                Note that all values must be specified.

                -Uname          Undefine the name as if

                                    #undef name

                                were given.  On non-Unix systems, "name"
                                will be forced to upper-case.

                -Xnumber        Enable debugging code.  If no  value  is
                                given,  a value of 1 will be used.  (For
                                maintenance of CPP only.)


        PRE-DEFINED VARIABLES:

                When CPP begins processing, the following variables will
                have been defined (unless the -N option is specified):

                Target computer (as appropriate):

                    pdp11, vax, M68000, m68000, m68k

                Target operating system (as appropriate):

                    rsx, rt11, vms, unix

                Target compiler (as appropriate):

                    decus, vax11c

                The implementor may add definitions to this  list.   The
                default  definitions  match  the  definition of the host
                computer, operating system, and C compiler.

                The following are always available unless undefined  (or
                -N was specified twice):

                    __FILE__    The  input  (or  #include)  file   being
                                compiled (as a quoted string).

                    __LINE__    The line number being compiled.

                    __DATE__    The date and time of  compilation  as  a
                                Unix  ctime  quoted string (the trailing
                                newline is removed).  Thus,

                                    printf("Bug at line %d,", __LINE__);
                                    printf(" source file %s", __FILE__);
                                    printf(" compiled on %s", __DATE__);


        DRAFT PROPOSED ANSI STANDARD CONSIDERATIONS:

                The current  version  of  the  Draft  Proposed  Standard
                explicitly  states  that  "readers  are requested not to
                specify or claim conformance to this draft." Readers and
                users  of  Decus  CPP  should  not assume that Decus CPP
                conforms to the standard, or that it will conform to the
                actual C Language Standard.

                When CPP is itself compiled, many features of the  Draft
                Proposed  Standard  that  are incompatible with existing
                preprocessors may be  disabled.   See  the  comments  in
                CPP's source for details.

                The latest version of the Draft  Proposed  Standard  (as
                reflected in Decus CPP) is dated November 12, 1984.

                Comments are removed from the input text.   The  comment
                is  replaced by a single space character.  The -C option
                preserves comments, writing them to the output file.

                The '$' character is considered to be a letter.  This is
                a permitted extension.

                The following new features of C are processed by CPP:

                    #elif expression (#else #if)
                    '\xNNN' (Hexadecimal constant)
                    '\a' (Ascii BELL)
                    '\v' (Ascii Vertical Tab)
                    #if defined NAME 1 if defined, 0 if not
                    #if defined (NAME) 1 if defined, 0 if not
                    #if sizeof (basic type)
                    unary +
                    123U, 123LU Unsigned ints and longs.
                    12.3L Long double numbers
                    token#token Token concatenation
                    #include token Expands to filename

                The Draft Proposed Standard has  extended  C,  adding  a
                constant string concatenation operator, where

                    "foo" "bar"

                is regarded as the single string "foobar".   (This  does
                not  affect  CPP's  processing but does permit a limited
                form of macro argument substitution into strings as will
                be discussed.)

                The Standard Committee plans to add token  concatenation
                to  #define command lines.  One suggested implementation
                is as follows:  the sequence "Token1#Token2" is  treated
                as  if  the programmer wrote "Token1Token2".  This could
                be used as follows:

                    #line 123
                    #define ATLINE foo#__LINE__

                ATLINE would be defined as foo123.

                Note that "Token2" must either have  the  format  of  an
                identifier or be a string of digits.  Thus, the string

                    #define ATLINE foo#1x3

                generates two tokens:  "foo1" and "x3".

                If the tokens T1 and T2 are concatenated into  T3,  this
                implementation operates as follows:

                  1. Expand T1 if it is a macro.
                  2. Expand T2 if it is a macro.
                  3. Join the tokens, forming T3.
                  4. Expand T3 if it is a macro.

                A macro formal parameter  will  be  substituted  into  a
                string or character constant if it is the only component
                of that constant:

                    #define VECSIZE 123
                    #define vprint(name, size) \
                      printf("name" "[" "size" "] = {\n")
                      ... vprint(vector, VECSIZE);

                expands (effectively) to

                      vprint("vector[123] = {\n");

                Note that  this  will  be  useful  if  your  C  compiler
                supports  the  new  string concatenation operation noted
                above.  As implemented here, if you write

                    #define string(arg) "arg"
                      ... string("foo") ...

                This implementation generates  "foo",  rather  than  the
                strictly  correct  ""foo"" (which will probably generate
                an error message).  This is, strictly speaking, an error
                in CPP and may be removed from future releases.

        ERROR MESSAGES:

                Many.  CPP prints warning or error messages if  you  try
                to     use     multiple-byte     character     constants
                (non-transportable), if you #undef a symbol that was not
                defined,  or  if  your  program  has  potentially nested
                comments.

        AUTHOR:

                Martin Minow

        BUGS:

                The #if expression processor uses signed integers  only.
                For example, #if 0xFFFFu < 0 may be TRUE.

-h- cpp.rno	Thu Mar 14 13:50:42 1985	CPP.RNO;85
.lm 8.rm 72.nhy

.no autosubtitle .style headers 3,0,0
.pg.uc.ps 58,80.lm 8.rm 72
.hd
.hd mixed
.head mixed

.st ########cpp#####C Pre-Processor
.pg
.hl 1 ^&C Pre-Processor\&
.s 2
.c ;*******
.c ;* cpp *
.c ;*******
.s 2
.lm +8
.s.i -8;NAME:	cpp -- C Pre-Processor
.s.f
.i -8;SYNOPSIS:
.s.nf
cpp [-options] [infile [outfile]]
.s.f
.i -8;DESCRIPTION:
.s
CPP reads a C source file, expands macros and include
files, and writes an input file for the C compiler.
If no file arguments are given, CPP reads from stdin
and writes to stdout.  If one file argument is given,
it will define the input file, while two file arguments
define both input and output files.  The file name "-"
is a synonym for stdin or stdout as appropriate.
.s
The following options are supported.  Options may
be given in either case.
.lm +16
.p -16
--C		If set, source-file comments are written
to the output file.  This allows the output of CPP to be
used as the input to a program, such as lint, that expects
commands embedded in specially-formatted comments.
.p -16
--Dname=value	Define the name as if the programmer wrote
.s
.nf
    _#define name value
.s
.fill
at the start of the first file.  If "=value" is not
given, a value of "1" will be used.
.s
On non-Unix systems, all alphabetic text will be forced
to upper-case.
.p -16
--E		Always return "success" to the operating
system, even if errors were detected.  Note that some fatal
errors, such as a missing _#include file, will terminate
CPP, returning "failure" even if the -E option is given.
.p -16
--Idirectory	Add this directory to the list of
directories searched for _#include "..." and _#include <...>
commands.  Note that there is no space between the
"-I" and the directory string.  More than one -I command
is permitted.  On non-Unix systems "directory" is forced
to upper-case.
.p -16
--N		CPP normally predefines some symbols defining
the target computer and operating system.  If -N is specified,
no symbols will be predefined.  If -N -N is specified, the
"always present" symbols, ____LINE____, ____FILE____, and ____DATE____
are not defined.
.p -16
--Stext		CPP normally assumes that the size of
the target computer's basic variable types is the same as the size
of these types of the host computer.  (This can be overridden
when CPP is compiled, however.)  The -S option allows dynamic
respecification of these values.  "text" is a string of
numbers, separated by commas, that specifies correct sizes.
The sizes must be specified in the exact order:
.s
.nf
    char short int long float double
.s
.fill
If you specify the option as "-S*text", pointers to these
types will be specified.  -S* takes one additional argument
for pointer to function (e.g. int (*)())
.s
For example, to specify sizes appropriate for a PDP-11,
you would write:
.s
.nf
       c s i l f d func
     -S1,2,2,2,4,8,
    -S*2,2,2,2,2,2,2
.s
.fill
Note that all values must be specified.
.p -16
--Uname		Undefine the name as if
.s
.nf
    _#undef name
.s
.fill
were given.  On non-Unix systems, "name" will be forced to
upper-case.
.p -16
--Xnumber	Enable debugging code.  If no value is
given, a value of 1 will be used.  (For maintenance of
CPP only.)
.s.lm -16
.s
.i -8;PRE-DEFINED VARIABLES:
.s
When CPP begins processing, the following variables will
have been defined (unless the -N option is specified):
.s
Target computer (as appropriate):
.s
.nf
    pdp11, vax, M68000, m68000, m68k
.fill
.s
Target operating system (as appropriate):
.s
.nf
    rsx, rt11, vms, unix
.fill
.s
Target compiler (as appropriate):
.s
.nf
    decus, vax11c
.fill
.s
The implementor may add definitions to this list.
The default definitions match the definition of the
host computer, operating system, and C compiler.
.s
The following are always available unless undefined (or
--N was specified twice):
.lm +16
.p -12
____FILE____	The input (or _#include) file being compiled
(as a quoted string).
.p -12
____LINE____	The line number being compiled.
.p -12
____DATE____	The date and time of compilation as
a Unix ctime quoted string (the trailing newline is removed).
Thus,
.s
.nf
    printf("Bug at line _%d,", ____LINE____);
    printf(" source file _%s", ____FILE____);
    printf(" compiled on _%s", ____DATE____);
.fill
.s.lm -16
.s
.i -8;DRAFT PROPOSED ANSI STANDARD CONSIDERATIONS:
.s
The current version of the Draft Proposed Standard
explicitly states that "readers are requested not to specify
or claim conformance to this draft."  Readers and users
of Decus CPP should not assume that Decus CPP conforms
to the standard, or that it will conform to the actual
C Language Standard.
.s
When CPP is itself compiled, many features of the Draft
Proposed Standard that are incompatible with existing
preprocessors may be disabled.  See the comments in CPP's
source for details.
.s
The latest version of the Draft Proposed Standard (as reflected
in Decus CPP) is dated November 12, 1984.
.s
Comments are removed from the input text.  The comment
is replaced by a single space character.  The -C option
preserves comments, writing them to the output file.
.s
The '$' character is considered to be a letter.  This is
a permitted extension.
.s
The following new features of C are processed by CPP:
.s.comment Note: significant spaces, not tabs, .br quotes #if, #elif
.br;####_#elif expression    (_#else _#if)
.br;####'_\xNNN'             (Hexadecimal constant)
.br;####'_\a'                (Ascii BELL)
.br;####'_\v'                (Ascii Vertical Tab)
.br;####_#if defined NAME    1 if defined, 0 if not
.br;####_#if defined (NAME)  1 if defined, 0 if not
.br;####_#if sizeof (basic type)
.br;####unary +
.br;####123U, 123LU          Unsigned ints and longs.
.br;####12.3L                Long double numbers
.br;####token_#token         Token concatenation
.br;####_#include token      Expands to filename
.s
The Draft Proposed Standard has extended C, adding a constant
string concatenation operator, where
.s
.nf
    "foo" "bar"
.s
.fill
is regarded as the single string "foobar".  (This does not
affect CPP's processing but does permit a limited form of
macro argument substitution into strings as will be discussed.)
.s
The Standard Committee plans to add token concatenation
to _#define command lines.  One suggested implementation
is as follows:  the sequence "Token1_#Token2" is treated
as if the programmer wrote "Token1Token2".  This could
be used as follows:
.s
.nf
    _#line 123
    _#define ATLINE foo_#____LINE____
.s
.fill
ATLINE would be defined as foo123.
.s
Note that "Token2" must either have the format of an
identifier or be a string of digits.  Thus, the string
.s
.nf
    _#define ATLINE foo_#1x3
.s
.fill
generates two tokens: "foo1" and "x3".
.s
If the tokens T1 and T2 are concatenated into T3,
this implementation operates as follows:
.s
.nf
  1. Expand T1 if it is a macro.
  2. Expand T2 if it is a macro.
  3. Join the tokens, forming T3.
  4. Expand T3 if it is a macro.
.s
.fill
A macro formal parameter will be substituted into a string
or character constant if it is the only component of that
constant:
.s
.nf
    _#define VECSIZE 123
    _#define vprint(name, size) _\
      printf("name" "[" "size" "] = {_\n")
      ... vprint(vector, VECSIZE);
.s
.fill
expands (effectively) to
.s
.nf
      vprint("vector[123] = {_\n");
.s
.fill
Note that this will be useful if your C compiler supports
the new string concatenation operation noted above.
As implemented here, if you write
.s
.nf
    _#define string(arg) "arg"
      ... string("foo") ...
.s
.fill
This implementation generates "foo", rather than the strictly
correct ""foo"" (which will probably generate an error message).
This is, strictly speaking, an error in CPP and may be removed
from future releases.
.s
.i -8;ERROR MESSAGES:
.s
Many.  CPP prints warning or error messages if you try to
use multiple-byte character constants (non-transportable),
if you _#undef a symbol that was not defined, or if your
program has potentially nested comments.
.s
.i -8;AUTHOR:
.s
Martin Minow
.s
.i -8;BUGS:
.s
The _#if expression processor uses signed integers only.
For example, _#if 0xFFFFu < 0 may be TRUE.
.s
.lm 8.rm 72.nhy

-h- cpp.1	Thu Mar 14 13:50:42 1985	CPP.1;1
.TH CPP 1
.SH NAME
cpp \- C Pre-Processor
.SH SYNOPSIS
.B cpp
[-options] [infile [outfile]]
.SH DESCRIPTION
.I Cpp
reads a C source file, expands macros and include
files, and writes an input file for the C compiler.
If no file arguments are given,
.I cpp
reads from stdin and
writes to stdout.
If one file argument is given, it
will define the input file,
while two file arguments
define both input and output files.
.PP
The following options are supported.
Options may be given in either case.
.TP 10
.BI \-I directory
Add this directory to the list of directories searched for
#include "..."
and
#include <...> commands.
Note that there is no space between the
.B \-I
and the directory string.
More than one
.B \-I
command is permitted.
On non-Unix systems
.I directory
is forced to upper-case.
.TP 10
.BI \-D name=value
Define the name as if the programmer wrote

		#define name value

at the start of the first file.
If
.I =value
is not given,
a value of "1"
will be used.

On non-Unix systems,
all alphabetic text
will be forced to upper-case.
.TP 10
.BI \-U name
Undefine the name as if

		#undef name

were given.
On non-Unix systems,
.I name
will be forced to upper-case.

The following variables are pre-defined:

Target computer (as appropriate):

	pdp11, vax, M68000, m68000, m68k

Target operating system (as appropriate):

	rsx, rt11, vms, unix

Target compiler (as appropriate):

	decus, vax11c

The implementor may add definitions to this list.
The default definitions match the definition of the host
computer, operating system, and C compiler.

The following are always available unless undefined:
.RS
.TP 10
__FILE__
The input (or #include) file being compiled (as a quoted string).
.TP 10
__LINE__
The line number being compiled.
.TP 10
__DATE__
The date and time of compilation as a Unix ctime quoted string
(the trailing newline is removed).
Thus,

   printf("Bug at line %d,", __LINE__);
   printf(" source file %s", __FILE__);
   printf(" compiled on %s", __DATE__);
.RE
.TP 10
.BI \-X number
Enable debugging code.
If no value is given,
a value of 1 will be used.
(For maintenance of CPP only.)
.SH DRAFT ANSI STANDARD CONSIDERATIONS:
.LP
Comments are removed from the input text.
The comment is replaced by a single space character.
This differs from usage on some existing preprocessors
(but it follows the Draft ANSI C Standard).
.LP
Note that arguments may be concatenated as follows:

    #define I(x)x
    #define CAT(x,y)I(x)y
    int value = CAT(1,2);
.LP
If the above macros are defined and invoked without
extraneous spaces,
they will be transportable to other
implementations.
Unfortunately,
this will not properly expand

    int CAT(foo,__LINE__);
    int CAT(foo,__LINE__);
.LP
as __LINE__ is copied into the input stream,
yielding
"foo__LINE__"
in both cases,
rather than the expected
"foo123", "foo124",
which would result if
__LINE__
were expanded and the result copied into the input stream.
.LP
Macro formal parameters are not recognized within quoted
strings and character constants in macro definitions.
.LP
CPP implements most of the ANSI draft standard.
You should be aware of the following differences:
.TP 4
o
In the draft standard,
the \\n (backslash-newline)
character is "invisible" to all processing.
In this implementation,
it is invisible to strings,
but acts as a "whitespace" (token-delimiter) outside of strings.
This considerably simplifies error message handling.
.TP 4
o
The following new features of C are processed by cpp:

    #elif expression     (#else #if)
    '\\xNNN'              (Hexadecimal constants)
    \'\\a'                 (Ascii BELL [silly])
    \'\\v'                 (Ascii VT)
    #if defined NAME     (1 if defined, 0 if not)
    #if defined (NAME)   (1 if defined, 0 if not)
    unary +              (gag me with a spoon)
.TP 4
o
The draft standard has extended C,
adding a string concatenation operator,
where

	"foo" "bar"

is regarded as the single string "foobar".
(This does not affect CPP's processing.)
.SH ERROR MESSAGES:
Many.
CPP
prints warning messages if you try to use
multiple-byte character constants (non-transportable) or
if you #undef a symbol that was not defined.
.SH BUGS:
Cpp prints spurious error or warning messages in #if
sequences such as the following:

    #define foo 0
    #if (foo != 0) ? (100 / foo) : 0
    #undef foo
    #if ((defined(foo)) ? foo : 0) == 1

Cpp
should suppress the error message if the expression's
value is already known.
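.LP
One possible shape for such a fix, given as a sketch and not as
code taken from CPP: the evaluator carries a "skip" flag that is
set while it parses an operand whose value cannot affect the result,
and errors are counted only when the flag is clear.
The names eval_div and eval_example are hypothetical.

```c
/* Divide a by b.  A zero divisor is counted as an error only */
/* when the quotient is actually needed (skip == 0).          */
static long eval_div(long a, long b, int skip, int *errors)
{
    if (b == 0) {
        if (!skip)
            (*errors)++;
        return 0;
    }
    return a / b;
}

/* Evaluate (foo != 0) ? (100 / foo) : 0, skipping the arm */
/* that the condition does not select.                      */
static long eval_example(long foo, int *errors)
{
    int cond = (foo != 0);
    long quotient = eval_div(100, foo, !cond, errors);

    return cond ? quotient : 0;
}
```

With foo equal to 0 the division is marked as skipped,
so the result is 0 and no error is counted.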
.SH AUTHOR:
Martin Minow
-h- makefile.txt	Thu Mar 14 13:50:42 1985	MAKEFILE.TXT;33
# Unix makefile for cpp
#
# The redefinitions of strchr() and strrchr() are needed for
# Ultrix-32, Unix 4.2 bsd (and maybe some other Unices).
#
BSDDEFINE = -Dstrchr=index -Dstrrchr=rindex
#
# On certain systems, such as Unix System III, you may need to define
# $(LINTFLAGS) in the make command line to set system-specific lint flags.
#
# This Makefile assumes cpp will replace the "standard" preprocessor.
# Delete the reference to -DLINE_PREFIX=\"\" if cpp is used stand-alone.
# LINEFIX is a sed script filter that reinserts #line -- used for testing
# if LINE_PREFIX is set to "".   Note that we must stand on our heads to
# match the # and a line had better not begin with $.  By the way, what
# we really want is
#	LINEFIX = | sed "s/^#/#line/"
#
CPPDEFINE = -DLINE_PREFIX=\"\"
LINEFIX = | sed "s/^[^ !\"%-~]/&line/"
#
# Define OLD_PREPROCESSOR non-zero to make a preprocessor which is
# "as compatible as possible" with the standard Unix V7 or Ultrix
# preprocessors.  This is needed to rebuild 4.2bsd, for example, as
# the preprocessor is used to modify assembler code, rather than C.
# This is not recommended for current development.  OLD_PREPROCESSOR
# forces the following definitions:
#   OK_DOLLAR		FALSE	$ is not allowed in variables
#   OK_CONCAT		FALSE	# cannot concatenate tokens
#   COMMENT_INVISIBLE	TRUE	old-style comment concatenation
#   STRING_FORMAL	TRUE	old-style string expansion
#
OLDDEFINE = -DOLD_PREPROCESSOR=1
#
# DEFINES collects all -D arguments for cc and lint:
# Change DEFINES = $(BSDDEFINE) $(CPPDEFINE) $(OLDDEFINE)
# for an old-style preprocessor.
#
DEFINES = $(BSDDEFINE) $(CPPDEFINE)

CFLAGS = -O $(DEFINES)

#
# ** compile cpp
#
SRCS = cpp1.c cpp2.c cpp3.c cpp4.c cpp5.c cpp6.c
OBJS = cpp1.o cpp2.o cpp3.o cpp4.o cpp5.o cpp6.o
cpp: $(OBJS)
	$(CC) $(CFLAGS) $(OBJS) -o cpp

#
# ** manual page
#
man:	cpp.1
	nroff -man cpp.1 >cpp.man

#
# ** Test cpp by preprocessing itself, compiling the result,
# ** repeating the process and diff'ing the result.  Note: this
# ** is not a good test of cpp, but a simple verification.
# ** The diff's should not report any changes.
# ** Note that a sed script may be executed for each compile
#
test:
	cpp cpp1.c $(LINEFIX) >old.tmp1.c
	cpp cpp2.c $(LINEFIX) >old.tmp2.c
	cpp cpp3.c $(LINEFIX) >old.tmp3.c
	cpp cpp4.c $(LINEFIX) >old.tmp4.c
	cpp cpp5.c $(LINEFIX) >old.tmp5.c
	cpp cpp6.c $(LINEFIX) >old.tmp6.c
	$(CC) $(CFLAGS) old.tmp[123456].c
	a.out cpp1.c >new.tmp1.c
	a.out cpp2.c >new.tmp2.c
	a.out cpp3.c >new.tmp3.c
	a.out cpp4.c >new.tmp4.c
	a.out cpp5.c >new.tmp5.c
	a.out cpp6.c >new.tmp6.c
	diff old.tmp1.c new.tmp1.c
	diff old.tmp2.c new.tmp2.c
	diff old.tmp3.c new.tmp3.c
	diff old.tmp4.c new.tmp4.c
	diff old.tmp5.c new.tmp5.c
	diff old.tmp6.c new.tmp6.c
	rm a.out old.tmp[123456].* new.tmp[123456].*

#
# A somewhat more extensive test is provided by the "clock"
# program (which is not distributed).  Substitute your favorite
# macro-rich program here.
#
clock:	clock.c cpp
	cpp clock.c $(LINEFIX) >temp.cpp.c
	cc temp.cpp.c -lcurses -ltermcap -o clock
	rm temp.cpp.c

#
# ** Lint the code
#

lint:	$(SRCS)
	lint $(LINTFLAGS) $(DEFINES) $(SRCS)

#
# ** Remove unneeded files
#
clean:
	rm -f $(OBJS) cpp

#
# ** Rebuild the archive files needed to distribute cpp
# ** Uses the Decus C archive utility.
#

archc:	archc.c
	$(CC) $(CFLAGS) archc.c -o archc

archx:	archx.c
	$(CC) $(CFLAGS) archx.c -o archx

archive: archc
	archc readme.txt cpp.mem archx.c archc.c cpp.rno cpp.1 \
		makefile.txt cpp*.h >cpp1.arc
	archc cpp1.c cpp2.c cpp3.c >cpp2.arc
	archc cpp4.c cpp5.c cpp6.c >cpp3.arc

#
# Object module dependencies
#

cpp1.o	:	cpp1.c cpp.h cppdef.h

cpp2.o	:	cpp2.c cpp.h cppdef.h

cpp3.o	:	cpp3.c cpp.h cppdef.h

cpp4.o	:	cpp4.c cpp.h cppdef.h

cpp5.o	:	cpp5.c cpp.h cppdef.h

cpp6.o	:	cpp6.c cpp.h cppdef.h

cpp.man	:	cpp.1


-h- cpp.h	Thu Mar 14 13:50:42 1985	CPP.H;120

/*
 *	I n t e r n a l   D e f i n i t i o n s    f o r   C P P
 *
 * In general, definitions in this file should not be changed.
 */

#ifndef	TRUE
#define	TRUE		1
#define	FALSE		0
#endif
#ifndef	EOS
/*
 * This is predefined in Decus C
 */
#define	EOS		'\0'		/* End of string		*/
#endif
#define	EOF_CHAR	0		/* Returned by get() on eof	*/
#define NULLST		((char *) NULL)	/* Pointer to nowhere (linted)	*/
#define	DEF_NOARGS	(-1)		/* #define foo vs #define foo()	*/

/*
 * The following may need to change if the host system doesn't use ASCII.
 */
#define	DEF_MAGIC	0x1D		/* Magic for #defines		*/
#define	TOK_SEP		0x1E		/* Token concatenation delim.	*/
#define COM_SEP		0x1F		/* Magic comment separator	*/

/*
 * Note -- in Ascii, the following will map macro formals onto DEL + the
 * C1 control character region (decimal 128 .. (128 + PAR_MAC)) which will
 * be ok as long as PAR_MAC is less than 33.  Note that the last PAR_MAC
 * value is reserved for string substitution.
 */

#define	MAC_PARM	0x7F		/* Macro formals start here	*/
#if PAR_MAC >= 33
	assertion fails -- PAR_MAC isn't less than 33
#endif
#define	LASTPARM	(PAR_MAC - 1)

/*
 * Character type codes.
 */

#define	INV		0		/* Invalid, must be zero	*/
#define	OP_EOE		INV		/* End of expression		*/
#define	DIG		1		/* Digit			*/
#define	LET		2		/* Identifier start		*/
#define	FIRST_BINOP	OP_ADD
#define	OP_ADD		3
#define	OP_SUB		4
#define	OP_MUL		5
#define	OP_DIV		6
#define	OP_MOD		7
#define	OP_ASL		8
#define	OP_ASR		9
#define	OP_AND		10		/* &, not &&			*/
#define	OP_OR		11		/* |, not ||			*/
#define	OP_XOR		12
#define	OP_EQ		13
#define	OP_NE		14
#define	OP_LT		15
#define	OP_LE		16
#define	OP_GE		17
#define	OP_GT		18
#define	OP_ANA		19		/* &&				*/
#define	OP_ORO		20		/* ||				*/
#define	OP_QUE		21		/* ?				*/
#define	OP_COL		22		/* :				*/
#define	OP_CMA		23		/* , (relevant?)		*/
#define	LAST_BINOP	OP_CMA		/* Last binary operator	*/
/*
 * The following are unary.
 */
#define	FIRST_UNOP	OP_PLU		/* First Unary operator	*/
#define	OP_PLU		24		/* + (draft ANSI standard)	*/
#define	OP_NEG		25		/* -				*/
#define	OP_COM		26		/* ~				*/
#define	OP_NOT		27		/* !				*/
#define	LAST_UNOP	OP_NOT
#define	OP_LPA		28		/* (				*/
#define	OP_RPA		29		/* )				*/
#define	OP_END		30		/* End of expression marker	*/
#define	OP_MAX		(OP_END + 1)	/* Number of operators		*/
#define	OP_FAIL		(OP_END + 1)	/* For error returns		*/

/*
 * The following are for lexical scanning only.
 */

#define	QUO		65		/* Both flavors of quotation	*/
#define	DOT		66		/* . might start a number	*/
#define	SPA		67		/* Space and tab		*/
#define	BSH		68		/* Just a backslash		*/
#define	END		69		/* EOF				*/

/*
 * These bits are set in ifstack[]
 */
#define	WAS_COMPILING	1		/* TRUE if compile set at entry	*/
#define	ELSE_SEEN	2		/* TRUE when #else processed	*/
#define	TRUE_SEEN	4		/* TRUE when #if TRUE processed	*/

/*
 * Define bits for the basic types and their adjectives
 */

#define	T_CHAR		  1
#define	T_INT		  2
#define	T_FLOAT		  4
#define	T_DOUBLE	  8
#define	T_SHORT		 16
#define	T_LONG		 32
#define	T_SIGNED	 64
#define	T_UNSIGNED	128
#define	T_PTR		256		/* Pointer			*/
#define	T_FPTR		512		/* Pointer to functions		*/

/*
 * The DEFBUF structure stores information about #defined
 * macros.  Note that the defbuf->repl information is always
 * in malloc storage.
 */

typedef struct defbuf {
	struct defbuf	*link;		/* Next define in chain	*/
	char		*repl;		/* -> replacement	*/
	int		hash;		/* Symbol table hash	*/
	int		nargs;		/* For define(args)	*/
	char		name[1];	/* #define name		*/
} DEFBUF;

/*
 * The FILEINFO structure stores information about open files
 * and macros being expanded.
 */

typedef struct fileinfo {
	char		*bptr;		/* Buffer pointer	*/
	int		line;		/* for include or macro	*/
	FILE		*fp;		/* File if non-null	*/
	struct fileinfo	*parent;	/* Link to includer	*/
	char		*filename;	/* File/macro name	*/
	char		*progname;	/* From #line statement	*/
	unsigned int	unrecur;	/* For macro recursion	*/
	char		buffer[1];	/* current input line	*/
} FILEINFO;

/*
 * The SIZES structure is used to store the values for #if sizeof
 */

typedef struct sizes {
    short	bits;			/* If this bit is set,		*/
    short	size;			/* this is the datum size value	*/
    short	psize;			/* this is the pointer size	*/
} SIZES;
/*
 * nomacarg is a built-in #define on Decus C.
 */

#ifdef	nomacarg
#define	cput		output		/* cput concatenates tokens	*/
#else
#if COMMENT_INVISIBLE
#define	cput(c)		{ if (c != TOK_SEP && c != COM_SEP) putchar(c); }
#else
#define	cput(c)		{ if (c != TOK_SEP) putchar(c); }
#endif
#endif

#ifndef	nomacarg
#define	streq(s1, s2)	(strcmp(s1, s2) == 0)
#endif

/*
 * Error codes.  VMS uses system definitions.
 * Decus C codes are defined in stdio.h.
 * Others are cooked to order.
 */

#if HOST == SYS_VMS
#include		<ssdef.h>
#include		<stsdef.h>
#define	IO_NORMAL	(SS$_NORMAL | STS$M_INHIB_MSG)
#define	IO_ERROR	SS$_ABORT
#endif
/*
 * Note: IO_NORMAL and IO_ERROR are defined in the Decus C stdio.h file
 */
#ifndef	IO_NORMAL
#define	IO_NORMAL	0
#endif
#ifndef	IO_ERROR
#define	IO_ERROR	1
#endif

/*
 * Externs
 */

extern int	line;			/* Current line number		*/
extern int	wrongline;		/* Force #line to cc pass 1	*/
extern char	type[];			/* Character classifier		*/
extern char	token[IDMAX + 1];	/* Current input token		*/
extern int	instring;		/* TRUE if scanning string	*/
extern int	inmacro;		/* TRUE if scanning #define	*/
extern int	errors;			/* Error counter		*/
extern int	recursion;		/* Macro depth counter		*/
extern char	ifstack[BLK_NEST];	/* #if information		*/
#define	compiling ifstack[0]
extern char	*ifptr;			/* -> current ifstack item	*/
extern char	*incdir[NINCLUDE];	/* -i directories		*/
extern char	**incend;		/* -> active end of incdir	*/
extern int	cflag;			/* -C option (keep comments)	*/
extern int	eflag;			/* -E option (ignore errors)	*/
extern int	nflag;			/* -N option (no pre-defines)	*/
extern int	rec_recover;		/* unwind recursive macros	*/
extern char	*preset[];		/* Standard predefined symbols	*/
extern char	*magic[];		/* Magic predefined symbols	*/
extern FILEINFO	*infile;		/* Current input file		*/
extern char	work[NWORK + 1];	/* #define scratch		*/
extern char	*workp;			/* Free space in work		*/
#if	DEBUG
extern int	debug;			/* Debug level			*/
#endif
extern int	keepcomments;		/* Don't remove comments if set	*/
extern SIZES	size_table[];		/* For #if sizeof sizes		*/
extern char	*getmem();		/* Get memory or die.		*/
extern DEFBUF	*lookid();		/* Look for a #define'd thing	*/
extern DEFBUF	*defendel();		/* Symbol table enter/delete	*/
extern char	*savestring();		/* Stuff string in malloc mem.	*/
extern char	*strcpy();
extern char	*strcat();
extern char	*strrchr();
extern char	*strchr();
extern long	time();
extern char	*sprintf();		/* Lint needs this		*/
-h- cppdef.h	Thu Mar 14 13:50:42 1985	CPPDEF.H;35
/*
 *		   S y s t e m   D e p e n d e n t
 *		D e f i n i t i o n s    f o r   C P P
 *
 * Definitions in this file may be edited to configure CPP for particular
 * host operating systems and target configurations.
 *
 * NOTE: cpp assumes it is compiled by a compiler that supports macros
 * with arguments.  If this is not the case (as for Decus C), #define
 * nomacarg -- and provide function equivalents for all macros.
 *
 * cpp also assumes the host and target implement the Ascii character set.
 * If this is not the case, you will have to do some editing here and there.
 */

/*
 * This redundant definition of TRUE and FALSE works around
 * a limitation of Decus C.
 */
#ifndef	TRUE
#define	TRUE			1
#define	FALSE			0
#endif

/*
 * Define the HOST operating system.  This is needed so that
 * cpp can use appropriate filename conventions.
 */
#define	SYS_UNKNOWN		0
#define	SYS_UNIX		1
#define	SYS_VMS			2
#define	SYS_RSX			3
#define	SYS_RT11		4
#define	SYS_LATTICE		5
#define	SYS_ONYX		6
#define	SYS_68000		7

#ifndef	HOST
#ifdef	unix
#define	HOST			SYS_UNIX
#else
#ifdef	vms
#define	HOST			SYS_VMS
#else
#ifdef	rsx
#define	HOST			SYS_RSX
#else
#ifdef	rt11
#define	HOST			SYS_RT11
#endif
#endif
#endif
#endif
#endif

#ifndef	HOST
#define	HOST			SYS_UNKNOWN
#endif

/*
 * We assume that the target is the same as the host system
 */
#ifndef	TARGET
#define	TARGET			HOST
#endif

/*
 * In order to predefine machine-dependent constants,
 * several strings are defined here:
 *
 * MACHINE	defines the target cpu (by name)
 * SYSTEM	defines the target operating system
 * COMPILER	defines the target compiler
 *
 *	The above may be #defined as "" if they are not wanted.
 *	They should not be #defined as NULL.
 *
 * LINE_PREFIX	defines the # output line prefix, if not "line"
 *		This should be defined as "" if cpp is to replace
 *		the "standard" C pre-processor.
 *
 * FILE_LOCAL	marks functions which are referenced only in the
 *		file they reside.  Some C compilers allow these
 *		to be marked "static" even though they are referenced
 *		by "extern" statements elsewhere.
 *
 * OK_DOLLAR	Should be set TRUE if $ is a valid alphabetic character
 *		in identifiers (default), or zero if $ is invalid.
 *		Default is TRUE.
 *
 * OK_CONCAT	Should be set TRUE if # may be used to concatenate
 *		tokens in macros (per the Ansi Draft Standard) or
 *		FALSE for old-style # processing (needed if cpp is
 *		to process assembler source code).
 *
 * OK_DATE	Predefines the compilation date if set TRUE.
 *		Not permitted by the Nov. 12, 1984 Draft Standard.
 *
 * S_CHAR etc.	Define the sizeof the basic TARGET machine word types.
 *		By default, sizes are set to the values for the HOST
 *		computer.  If this is inappropriate, see the code in
 *		cpp3.c for details on what to change.  Also, if you
 *		have a machine where sizeof (signed int) differs from
 *		sizeof (unsigned int), you will have to edit code and
 *		tables in cpp3.c (and extend the -S option definition.)
 *
 * CPP_LIBRARY	May be defined if you have a site-specific include directory
 *		which is to be searched *before* the operating-system
 *		specific directories.
 */

#if TARGET == SYS_LATTICE
/*
 * We assume the operating system is pcdos for the IBM-PC.
 * We also assume the small model (just like the PDP-11)
 */
#define MACHINE			"i8086"
#define	SYSTEM			"pcdos"
#endif

#if TARGET == SYS_ONYX
#define	MACHINE			"z8000"
#define	SYSTEM			"unix"
#endif

#if TARGET == SYS_VMS
#define	MACHINE			"vax"
#define	SYSTEM			"vms"
#define	COMPILER		"vax11c"
#endif

#if TARGET == SYS_RSX
#define	MACHINE			"pdp11"
#define	SYSTEM			"rsx"
#define	COMPILER		"decus"
#endif

#if TARGET == SYS_RT11
#define	MACHINE			"pdp11"
#define	SYSTEM			"rt11"
#define	COMPILER		"decus"
#endif

#if TARGET == SYS_68000
/*
 * All three machine designators have been seen in various systems.
 * Warning -- compilers differ as to sizeof (int).  cpp3 assumes that
 * sizeof (int) == 2
 */
#define	MACHINE			"M68000", "m68000", "m68k"
#define	SYSTEM			"unix"
#endif

#if	TARGET == SYS_UNIX
#define	SYSTEM			"unix"
#ifdef	pdp11
#define	MACHINE			"pdp11"
#endif
#ifdef	vax
#define	MACHINE			"vax"
#endif
#endif

/*
 * defaults
 */

#ifndef MSG_PREFIX
#define MSG_PREFIX		"cpp: "
#endif

#ifndef LINE_PREFIX
#ifdef	decus
#define	LINE_PREFIX		""
#else
#define LINE_PREFIX		"line"
#endif
#endif

/*
 * OLD_PREPROCESSOR forces the definition of OK_DOLLAR, OK_CONCAT,
 * COMMENT_INVISIBLE, and STRING_FORMAL to values appropriate for
 * an old-style preprocessor.
 */
 
#ifndef	OLD_PREPROCESSOR
#define	OLD_PREPROCESSOR	FALSE
#endif

#if	OLD_PREPROCESSOR
#define	OK_DOLLAR		FALSE
#define	OK_CONCAT		FALSE
#define	COMMENT_INVISIBLE	TRUE
#define	STRING_FORMAL		TRUE
#endif

/*
 * RECURSION_LIMIT may be set to -1 to disable the macro recursion test.
 */
#ifndef	RECURSION_LIMIT
#define	RECURSION_LIMIT	1000
#endif

/*
 * BITS_CHAR may be defined to set the number of bits per character.
 * It is needed only for multi-byte character constants.
 */
#ifndef	BITS_CHAR
#define	BITS_CHAR		8
#endif

/*
 * BIG_ENDIAN is set TRUE on machines (such as the IBM 360 series)
 * where 'ab' stores 'a' in the high-bits and 'b' in the low-bits.
 * It is set FALSE on machines (such as the PDP-11 and Vax-11)
 * where 'ab' stores 'a' in the low-bits and 'b' in the high-bits.
 * (Or is it the other way around?) -- Warning: BIG_ENDIAN code is untested.
 */
#ifndef	BIG_ENDIAN
#define	BIG_ENDIAN 		FALSE
#endif

/*
 * COMMENT_INVISIBLE may be defined to allow "old-style" comment
 * processing, whereby the comment becomes a zero-length token
 * delimiter.  This permitted tokens to be concatenated in macro
 * expansions.  This was removed from the Draft Ansi Standard.
 */
#ifndef	COMMENT_INVISIBLE
#define	COMMENT_INVISIBLE	FALSE
#endif

/*
 * STRING_FORMAL may be defined to allow recognition of macro parameters
 * anywhere in replacement strings.  This was removed from the Draft Ansi
 * Standard and a limited recognition capability added.
 */
#ifndef	STRING_FORMAL
#define	STRING_FORMAL		FALSE
#endif

/*
 * OK_DOLLAR enables use of $ as a valid "letter" in identifiers.
 * This is a permitted extension to the Ansi Standard and is required
 * for e.g., VMS, RSX-11M, etc.   It should be set FALSE if cpp is
 * used to preprocess assembler source on Unix systems.  OLD_PREPROCESSOR
 * sets OK_DOLLAR FALSE for that reason.
 */
#ifndef	OK_DOLLAR
#define	OK_DOLLAR		TRUE
#endif

/*
 * OK_CONCAT enables (one possible implementation of) token concatenation.
 * If cpp is used to preprocess Unix assembler source, this should be
 * set FALSE as the concatenation character, #, is used by the assembler.
 */
#ifndef	OK_CONCAT
#define	OK_CONCAT		TRUE
#endif

/*
 * OK_DATE may be enabled to predefine today's date as a string
 * at the start of each compilation.  This is apparently not permitted
 * by the Draft Ansi Standard.
 */
#ifndef	OK_DATE
#define	OK_DATE		TRUE
#endif

/*
 * Some common definitions.
 */

#ifndef	DEBUG
#define	DEBUG			FALSE
#endif

/*
 * The following definitions are used to allocate memory for
 * work buffers.  In general, they should not be modified
 * by implementors.
 *
 * PAR_MAC	The maximum number of #define parameters (31 per Standard)
 *		Note: we need another one for strings.
 * IDMAX	The longest identifier, 31 per Ansi Standard
 * NBUFF	Input buffer size
 * NWORK	Work buffer size -- the longest macro
 *		must fit here after expansion.
 * NEXP		The nesting depth of #if expressions
 * NINCLUDE	The number of directories that may be specified
 *		on a per-system basis, or by the -I option.
 * BLK_NEST	The number of nested #if's permitted.
 */

#define	IDMAX			 31
#define	PAR_MAC		   (31 + 1)
#define	NBUFF			512
#define	NWORK			512
#define	NEXP			128
#define	NINCLUDE		  7
#define	NPARMWORK		(NWORK * 2)
#define	BLK_NEST		32

/*
 * Some special constants.  These may need to be changed if cpp
 * is ported to a weird machine.
 *
 * NOTE: if cpp is run on a non-ascii machine, ALERT and VT may
 * need to be changed.  They are used to implement the proposed
 * ANSI standard C control characters '\a' and '\v' only.
 * DEL is used to tag macro tokens to prevent #define foo foo
 * from looping.  Note that we don't try to prevent more elaborate
 * #define loops from occurring.
 */

#ifndef	ALERT
#define	ALERT			'\007'		/* '\a' is "Bell"	*/
#endif

#ifndef	VT
#define	VT			'\013'		/* Vertical Tab CTRL/K	*/
#endif


#ifndef	FILE_LOCAL
#ifdef	decus
#define	FILE_LOCAL		static
#else
#ifdef	vax11c
#define	FILE_LOCAL		static
#else
#define	FILE_LOCAL				/* Others are global	*/
#endif
#endif
#endif
-h- cpp1.c	Thu Mar 14 13:50:42 1985	CPP1.C;214
/*
 * CPP main program.
 *
 * Edit history
 * 21-May-84	MM	"Field test" release
 * 23-May-84	MM	Some minor hacks.
 * 30-May-84	ARF	Didn't get enough memory for __DATE__
 *			Added code to read stdin if no input
 *			files are provided.
 * 29-Jun-84	MM	Added ARF's suggestions, Unixifying cpp.
 * 11-Jul-84	MM	"Official" first release (that's what I thought!)
 * 22-Jul-84	MM/ARF/SCK Fixed line number bugs, added cpp recognition
 *			of #line, fixed problems with #include.
 * 23-Jul-84	MM	More (minor) include hacking, some documentation.
 *			Also, redid cpp's #include files
 * 25-Jul-84	MM	#line filename isn't used for #include searchlist
 *			#line format is <number> <optional name>
 * 25-Jul-84	ARF/MM	Various bugs, mostly serious.  Removed homemade doprint
 * 01-Aug-84	MM	Fixed recursion bug, remove extra newlines and
 *			leading whitespace from cpp output.
 * 02-Aug-84	MM	Hacked (i.e. optimized) out blank lines and unneeded
 *			whitespace in general.  Cleaned up unget()'s.
 * 03-Aug-84	Keie	Several bug fixes from Ed Keizer, Vrije Universiteit.
 *			-- corrected arg. count in -D and pre-defined
 *			macros.  Also, allow \n inside macro actual parameter
 *			lists.
 * 06-Aug-84	MM	If debugging, dump the preset vector at startup.
 * 12-Aug-84	MM/SCK	Some small changes from Sam Kendall
 * 15-Aug-84	Keie/MM	cerror, cwarn, etc. take a single string arg.
 *			cierror, etc. take a single int. arg.
 *			changed LINE_PREFIX slightly so it can be
 *			changed in the makefile.
 * 31-Aug-84	MM	USENET net.sources release.
 *  7-Sep-84	SCH/ado Lint complaints
 * 10-Sep-84	Keie	Char's can't be signed in some implementations
 * 11-Sep-84	ado	Added -C flag, pathological line number fix
 * 13-Sep-84	ado	Added -E flag (does nothing) and "-" file for stdin.
 * 14-Sep-84	MM	Allow # 123 as a synonym for #line 123
 * 19-Sep-84	MM	scanid always reads to token, make sure #line is
 *			written to a new line, even if -C switch given.
 *			Also, cpp - - reads stdin, writes stdout.
 * 03-Oct-84	ado/MM	Several changes to line counting and keepcomments
 *			stuff.  Also a rewritten control() hasher -- much
 *			simpler and no less "perfect". Note also changes
 *			in cpp3.c to fix numeric scanning.
 * 04-Oct-84	MM	Added recognition of macro formal parameters if
 *			they are the only thing in a string, per the
 *			draft standard.
 * 08-Oct-84	MM	One more attack on scannumber
 * 15-Oct-84	MM/ado	Added -N to disable predefined symbols.  Fixed
 *			linecount if COMMENT_INVISIBLE enabled.
 * 22-Oct-84	MM	Don't evaluate the #if/#ifdef argument if
 *			compilation is suppressed.  This prevents
 *			unnecessary error messages in sequences such as
 *			    #ifdef FOO		-- undefined
 *			    #if FOO == 10	-- shouldn't print warning
 * 25-Oct-84	MM	Fixed bug in false ifdef suppression.  On vms,
 *			#include <foo> should open foo.h -- this duplicates
 *			the behavior of Vax-C
 * 31-Oct-84	ado/MM	Parameterized $ in identifiers.  Added a better
 *			token concatenator and took out the trial
 *			concatenation code.  Also improved #ifdef code
 *			and cleaned up the macro recursion tester.
 *  2-Nov-84	MM/ado	Some bug fixes in token concatenation, also
 *			a variety of minor (uninteresting) hacks.
 *  6-Nov-84	MM	Happy Birthday.  Broke into 4 files and added
 *			#if sizeof (basic_types)
 *  9-Nov-84	MM	Added -S* for pointer type sizes
 * 13-Nov-84	MM	Split cpp1.c, added vms defaulting
 * 23-Nov-84	MM/ado	-E suppresses error exit, added CPP_INCLUDE,
 *			fixed strncpy bug.
 *  3-Dec-84	ado/MM	Added OLD_PREPROCESSOR
 *  7-Dec-84	MM	Stuff in Nov 12 Draft Standard
 * 17-Dec-84	george	Fixed problems with recursive macros
 * 17-Dec-84	MM	Yet another attack on #if's (f/t)level removed.
 * 07-Jan-85	ado	Init defines before doing command line options
 *			so -Uunix works.
 * 14-Jan-85	MM	Fixed bug with logical device translation on VMS.
 * 18-Jan-85	MM	Rearranged fgetname() conditionals.
 */

/*)BUILD
	$(PROGRAM)	= cpp
	$(FILES)	= { cpp1 cpp2 cpp3 cpp4 cpp5 cpp6 }
	$(INCLUDE)	= { cppdef.h cpp.h }
	$(STACK)	= 2000
	$(TKBOPTIONS)	= {
		STACK	= 2000
	}
*/

#ifdef	DOCUMENTATION

title	cpp		C Pre-Processor
index			C pre-processor

synopsis
	.s.nf
	cpp [-options] [infile [outfile]]
	.s.f
description

	CPP reads a C source file, expands macros and include
	files, and writes an input file for the C compiler.
	If no file arguments are given, CPP reads from stdin
	and writes to stdout.  If one file argument is given,
	it will define the input file, while two file arguments
	define both input and output files.  The file name "-"
	is a synonym for stdin or stdout as appropriate.

	The following options are supported.  Options may
	be given in either case.
	.lm +16
	.p -16
	-C		If set, source-file comments are written
	to the output file.  This allows the output of CPP to be
	used as the input to a program, such as lint, that expects
	commands embedded in specially-formatted comments.
	.p -16
	-Dname=value	Define the name as if the programmer wrote

	    #define name value

	at the start of the first file.  If "=value" is not
	given, a value of "1" will be used.

	On non-Unix systems, all alphabetic text will be forced
	to upper-case.
	.p -16
	-E		Always return "success" to the operating
	system, even if errors were detected.  Note that some fatal
	errors, such as a missing #include file, will terminate
	CPP, returning "failure" even if the -E option is given.
	.p -16
	-Idirectory	Add this directory to the list of
	directories searched for #include "..." and #include <...>
	commands.  Note that there is no space between the
	"-I" and the directory string.  More than one -I command
	is permitted.  On non-Unix systems "directory" is forced
	to upper-case.
	.p -16
	-N		CPP normally predefines some symbols defining
	the target computer and operating system.  If -N is specified,
	no symbols will be predefined.  If -N -N is specified, the
	"always present" symbols, __LINE__, __FILE__, and __DATE__
	are not defined.
	.p -16
	-Stext		CPP normally assumes that the size of
	the target computer's basic variable types is the same as the size
	of these types on the host computer.  (This can be overridden
	when CPP is compiled, however.)  The -S option allows dynamic
	respecification of these values.  "text" is a string of
	numbers, separated by commas, that specifies correct sizes.
	The sizes must be specified in the exact order:

	    char short int long float double

	If you specify the option as "-S*text", the sizes of pointers
	to these types will be specified.  -S* takes one additional
	argument for pointer to function (e.g. int (*)()).

	For example, to specify sizes appropriate for a PDP-11,
	you would write:

	       c s i l f d func
	     -S1,2,2,4,4,8,
	    -S*2,2,2,2,2,2,2

	Note that all values must be specified.
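
	A minimal sketch of how such a size list could be parsed is
	given here.  It is not the code CPP uses (see cpp3.c for that);
	the names NTYPES and parse_sizes are hypothetical, and the
	sketch assumes that a trailing comma is tolerated, as in the
	example above.

```c
#include <stdlib.h>

#define NTYPES 6    /* char short int long float double */

/* Parse a comma-separated list of NTYPES positive sizes.      */
/* Returns 0 on success, or -1 if the list is malformed or     */
/* does not supply every value.                                */
static int parse_sizes(const char *text, int sizes[NTYPES])
{
    int i;
    char *end;

    for (i = 0; i < NTYPES; i++) {
        long v = strtol(text, &end, 10);

        if (end == text || v <= 0)
            return -1;          /* no digits, or a bad size */
        sizes[i] = (int) v;
        text = end;
        if (*text == ',')
            text++;             /* step over the separator  */
        else if (i < NTYPES - 1)
            return -1;          /* list ended too early     */
    }
    return (*text == '\0') ? 0 : -1;
}
```

	Called with "1,2,2,4,4,8" the function fills sizes[] and
	returns 0; called with "1,2,2" it returns -1, since every
	value must be given.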
	.p -16
	-Uname		Undefine the name as if

	    #undef name

	were given.  On non-Unix systems, "name" will be forced to
	upper-case.
	.p -16
	-Xnumber	Enable debugging code.  If no value is
	given, a value of 1 will be used.  (For maintenance of
	CPP only.)
	.s.lm -16

Pre-Defined Variables

	When CPP begins processing, the following variables will
	have been defined (unless the -N option is specified):
	.s
	Target computer (as appropriate):
	.s
	    pdp11, vax, M68000 m68000 m68k
	.s
	Target operating system (as appropriate):
	.s
	    rsx, rt11, vms, unix
	.s
	Target compiler (as appropriate):
	.s
	    decus, vax11c
	.s
	The implementor may add definitions to this list.
	The default definitions match the definition of the
	host computer, operating system, and C compiler.
	.s
	The following are always available unless undefined (or
	-N was specified twice):
	.lm +16
	.p -12
	__FILE__	The input (or #include) file being compiled
	(as a quoted string).
	.p -12
	__LINE__	The line number being compiled.
	.p -12
	__DATE__	The date and time of compilation as
	a Unix ctime quoted string (the trailing newline is removed).
	Thus,
	.s
	    printf("Bug at line %d,", __LINE__);
	    printf(" source file %s", __FILE__);
	    printf(" compiled on %s", __DATE__);
	.s.lm -16

Draft Proposed Ansi Standard Considerations

	The current version of the Draft Proposed Standard
	explicitly states that "readers are requested not to specify
	or claim conformance to this draft."  Readers and users
	of Decus CPP should not assume that Decus CPP conforms
	to the standard, or that it will conform to the actual
	C Language Standard.

	When CPP is itself compiled, many features of the Draft
	Proposed Standard that are incompatible with existing
	preprocessors may be disabled.  See the comments in CPP's
	source for details.

	The latest version of the Draft Proposed Standard (as reflected
	in Decus CPP) is dated November 12, 1984.

	Comments are removed from the input text.  The comment
	is replaced by a single space character.  The -C option
	preserves comments, writing them to the output file.

	The '$' character is considered to be a letter.  This is
	a permitted extension.

	The following new features of C are processed by CPP:
	.s.comment Note: significant spaces, not tabs, .br quotes #if, #elif
	.br;####_#elif expression    (_#else _#if)
	.br;####'_\xNNN'             (Hexadecimal constant)
	.br;####'_\a'                (Ascii BELL)
	.br;####'_\v'                (Ascii Vertical Tab)
	.br;####_#if defined NAME    1 if defined, 0 if not
	.br;####_#if defined (NAME)  1 if defined, 0 if not  
	.br;####_#if sizeof (basic type)
	.br;####unary +
	.br;####123U, 123LU          Unsigned ints and longs.
	.br;####12.3L                Long double numbers
	.br;####token_#token         Token concatenation
	.br;####_#include token      Expands to filename

	The Draft Proposed Standard has extended C, adding a constant
	string concatenation operator, where

	    "foo" "bar"

	is regarded as the single string "foobar".  (This does not
	affect CPP's processing but does permit a limited form of
	macro argument substitution into strings as will be discussed.)

	The Standard Committee plans to add token concatenation
	to #define command lines.  One suggested implementation
	is as follows:  the sequence "Token1#Token2" is treated
	as if the programmer wrote "Token1Token2".  This could
	be used as follows:

	    #line 123
	    #define ATLINE foo#__LINE__

	ATLINE would be defined as foo123.

	Note that "Token2" must either have the format of an
	identifier or be a string of digits.  Thus, the string

	    #define ATLINE foo#1x3

	generates two tokens: "foo1" and "x3".

	If the tokens T1 and T2 are concatenated into T3,
	this implementation operates as follows:

	  1. Expand T1 if it is a macro.
	  2. Expand T2 if it is a macro.
	  3. Join the tokens, forming T3.
	  4. Expand T3 if it is a macro.
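
	The four steps above can be sketched as follows.  This is an
	illustration only, not CPP's own code: the one-level macro
	table, the names expand and concat_tokens, and the extra
	macro foo123 (added so that step 4 has something to do) are
	all hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical macro table; not cpp's real symbol table. */
struct macro { const char *name; const char *repl; };

static const struct macro table[] = {
    { "__LINE__", "123" },
    { "foo123",   "counter" },
};

/* Return the replacement text if tok is a macro, else tok itself. */
static const char *expand(const char *tok)
{
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].name, tok) == 0)
            return table[i].repl;
    return tok;
}

/* Steps 1 to 4: expand T1, expand T2, join them into T3, */
/* then expand T3 if it names a macro.                     */
static const char *concat_tokens(const char *t1, const char *t2,
                                 char *t3, size_t size)
{
    snprintf(t3, size, "%s%s", expand(t1), expand(t2));
    return expand(t3);
}
```

	With this table, joining "foo" and "__LINE__" leaves "foo123"
	in t3 and returns "counter"; joining two names that are not
	macros simply returns the joined token.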

	A macro formal parameter will be substituted into a string
	or character constant if it is the only component of that
	constant:

	    #define VECSIZE 123
	    #define vprint(name, size) \
	      printf("name" "[" "size" "] = {\n")
	      ... vprint(vector, VECSIZE);

	expands (effectively) to

	      printf("vector[123] = {\n");

	Note that this will be useful if your C compiler supports
	the new string concatenation operation noted above.
	As implemented here, if you write

	    #define string(arg) "arg"
	      ... string("foo") ...

	CPP generates "foo", rather than the strictly correct ""foo""
	(which would probably draw an error message from the compiler).
	Strictly speaking, this is an error in CPP, and the behavior
	may be removed from future releases.

error messages

	Many.  CPP prints warning or error messages if you try to
	use multiple-byte character constants (non-transportable),
	if you #undef a symbol that was not defined, or if your
	program has potentially nested comments.

author

	Martin Minow

bugs

	The #if expression processor uses signed integers only.
	For example, #if 0xFFFFu < 0 may be TRUE.

#endif

#include	<stdio.h>
#include	<ctype.h>
#include	"cppdef.h"
#include	"cpp.h"

/*
 * Commonly used global variables:
 * line		is the current input line number.
 * wrongline	is set in many places when the actual output
 *		line is out of sync with the numbering, e.g.,
 *		when expanding a macro with an embedded newline.
 *
 * token	holds the last identifier scanned (which might
 *		be a candidate for macro expansion).
 * errors	is the running cpp error counter.
 * infile	is the head of a linked list of input files (extended by
 *		#include and macros being expanded).  infile always points
 *		to the current file/macro, infile->parent to the includer,
 *		etc.  infile->fp is NULL if this input stream is a macro.
 */
int		line;			/* Current line number		*/
int		wrongline;		/* Force #line to compiler	*/
char		token[IDMAX + 1];	/* Current input token		*/
int		errors;			/* cpp error counter		*/
FILEINFO	*infile = NULL;		/* Current input file		*/
#if DEBUG
int		debug;			/* TRUE if debugging now	*/
#endif
/*
 * This counter is incremented when a macro expansion is initiated.
 * If it exceeds a built-in value, the expansion stops -- this tests
 * for a runaway condition:
 *	#define X Y
 *	#define Y X
 *	X
 * This can be disabled by falsifying rec_recover.  (Nothing does this
 * currently: it is a hook for an eventual invocation flag.)
 */
int		recursion;		/* Infinite recursion counter	*/
int		rec_recover = TRUE;	/* Unwind recursive macros	*/

/*
 * instring is set TRUE when a string is scanned.  It modifies the
 * behavior of the "get next character" routine, causing all characters
 * to be passed to the caller (except <DEF_MAGIC>).  Note especially that
 * comments and \<newline> are not removed from the source.  (This
 * prevents cpp output lines from being arbitrarily long).
 *
 * inmacro is set by #define -- it absorbs comments and converts
 * form-feed and vertical-tab to space, but returns \<newline>
 * to the caller.  Strictly speaking, this is a bug as \<newline>
 * shouldn't delimit tokens, but we'll worry about that some other
 * time -- it is more important to prevent infinitely long output lines.
 *
 * instring and inmacro are parameters to the get() routine which
 * were made global for speed.
 */
int		instring = FALSE;	/* TRUE if scanning string	*/
int		inmacro = FALSE;	/* TRUE if #defining a macro	*/

/*
 * work[] and workp are used to store one piece of text in a temporary
 * buffer.  To initialize storage, set workp = work.  To store one
 * character, call save(c);  (This will fatally exit if there isn't
 * room.)  To terminate the string, call save(EOS).  Note that
 * the work buffer is used by several subroutines -- be sure your
 * data won't be overwritten.  The extra byte in the allocation is
 * needed for string formal replacement.
 */
char		work[NWORK + 1];	/* Work buffer			*/
char		*workp;			/* Work buffer pointer		*/

/*
 * keepcomments is set TRUE by the -C option.  If TRUE, comments
 * are written directly to the output stream.  This is needed if
 * the output from cpp is to be passed to lint (which uses commands
 * embedded in comments).  cflag contains the permanent state of the
 * -C flag.  keepcomments is always falsified when processing #control
 * commands and when compilation is suppressed by a false #if.
 *
 * If eflag is set, CPP returns "success" even if non-fatal errors
 * were detected.
 *
 * If nflag is non-zero, no symbols are predefined except __LINE__,
 * __FILE__, and __DATE__.  If nflag > 1, absolutely no symbols
 * are predefined.
 */
int		keepcomments = FALSE;	/* Write out comments flag	*/
int		cflag = FALSE;		/* -C option (keep comments)	*/
int		eflag = FALSE;		/* -E option (never fail)	*/
int		nflag = 0;		/* -N option (no predefines)	*/

/*
 * ifstack[] holds information about nested #if's.  It is always
 * accessed via *ifptr.  The information is as follows:
 *	WAS_COMPILING	state of compiling flag at outer level.
 *	ELSE_SEEN	set TRUE when #else seen to prevent 2nd #else.
 *	TRUE_SEEN	set TRUE when #if or #elif succeeds
 * ifstack[0] holds the compiling flag.  It is TRUE if compilation
 * is currently enabled.  Note that this must be initialized TRUE.
 */
char		ifstack[BLK_NEST] = { TRUE };	/* #if information	*/
char		*ifptr = ifstack;		/* -> current ifstack[] */

/*
 * incdir[] stores the -i directories (and the system-specific
 * #include <...> directories).
 */
char	*incdir[NINCLUDE];		/* -i directories		*/
char	**incend = incdir;		/* -> free space in incdir[]	*/

/*
 * This is the table used to predefine target machine and operating
 * system designators.  It may need hacking for specific circumstances.
 * Note: it is not clear that this is part of the Ansi Standard.
 * The -N option suppresses preset definitions.
 */
char	*preset[] = {			/* names defined at cpp start	*/
#ifdef	MACHINE
	MACHINE,
#endif
#ifdef	SYSTEM
	SYSTEM,
#endif
#ifdef	COMPILER
	COMPILER,
#endif
#if	DEBUG
	"decus_cpp",			/* Ourselves!			*/
#endif
	NULL				/* Must be last			*/
};

/*
 * The value of these predefined symbols must be recomputed whenever
 * they are evaluated.  The order must not be changed.
 */
char	*magic[] = {			/* Note: order is important	*/
	"__LINE__",
	"__FILE__",
	NULL				/* Must be last			*/
};

main(argc, argv)
int		argc;
char		*argv[];
{
	register int	i;

#if HOST == SYS_VMS
	argc = getredirection(argc, argv);	/* vms >file and <file	*/
#endif
	initdefines();				/* O.S. specific def's	*/
	i = dooptions(argc, argv);		/* Command line -flags	*/
	switch (i) {
	case 3:
	    /*
	     * Get output file, "-" means use stdout.
	     */
	    if (!streq(argv[2], "-")) {
#if HOST == SYS_VMS
		/*
		 * On vms, reopen stdout with "vanilla rms" attributes.
		 */
		if ((i = creat(argv[2], 0, "rat=cr", "rfm=var")) == -1
		 || dup2(i, fileno(stdout)) == -1) {
#else
		if (freopen(argv[2], "w", stdout) == NULL) {
#endif
		    perror(argv[2]);
		    cerror("Can't open output file \"%s\"", argv[2]);
		    exit(IO_ERROR);
		}
	    }				/* Continue by opening input	*/
	case 2:				/* One file -> stdin		*/
	    /*
	     * Open input file, "-" means use stdin.
	     */
	    if (!streq(argv[1], "-")) {
		if (freopen(argv[1], "r", stdin) == NULL) {
		    perror(argv[1]);
		    cerror("Can't open input file \"%s\"", argv[1]);
		    exit(IO_ERROR);
		}
		strcpy(work, argv[1]);	/* Remember input filename	*/
		break;
	    }				/* Else, just get stdin		*/
	case 0:				/* No args?			*/
	case 1:				/* No files, stdin -> stdout	*/
#if HOST == SYS_VMS || HOST == SYS_RSX || HOST == SYS_RT11
	    fgetname(stdin, work);	/* Vax-11C, Decus C know name	*/
#else
	    work[0] = EOS;		/* Unix can't find stdin name	*/
#endif
	    break;

	default:
	    exit(IO_ERROR);		/* Can't happen			*/
	}
	setincdirs();			/* Setup -I include directories	*/
	addfile(stdin, work);		/* "open" main input file	*/
#if DEBUG
	if (debug > 0)
	    dumpdef("preset #define symbols");
#endif
	cppmain();			/* Process main file		*/
	if ((i = (ifptr - &ifstack[0])) != 0) {
#if OLD_PREPROCESSOR
	    ciwarn("Inside #ifdef block at end of input, depth = %d", i);
#else
	    cierror("Inside #ifdef block at end of input, depth = %d", i);
#endif
	}
	fclose(stdout);
	if (errors > 0) {
	    fprintf(stderr, (errors == 1)
		? "%d error in preprocessor\n"
		: "%d errors in preprocessor\n", errors);
	    if (!eflag)
		exit(IO_ERROR);
	}
	exit(IO_NORMAL);		/* No errors or -E option set	*/
}

FILE_LOCAL
cppmain()
/*
 * Main process for cpp -- copies tokens from the current input
 * stream (main file, include file, or a macro) to the output
 * file.
 */
{
	register int		c;		/* Current character	*/
	register int		counter;	/* newlines and spaces	*/
	extern int		output();	/* Output one character	*/

	/*
	 * Explicitly output a #line at the start of cpp output so
	 * that lint (etc.) knows the name of the original source
	 * file.  If we don't do this explicitly, we may get
	 * the name of the first #include file instead.
	 */
	sharp();
	/*
	 * This loop is started "from the top" at the beginning of each line.
	 * wrongline is set TRUE in many places if it is necessary to write
	 * a #line record.  (But we don't write them when expanding macros.)
	 *
	 * The counter variable has two different uses:  at
	 * the start of a line, it counts the number of blank lines that
	 * have been skipped over.  These are then either output via
	 * #line records or by outputting explicit blank lines.
 	 * When expanding tokens within a line, the counter remembers
	 * whether a blank/tab has been output.  These are dropped
	 * at the end of the line, and replaced by a single blank
	 * within lines.
	 */
	for (;;) {
	    counter = 0;			/* Count empty lines	*/
	    for (;;) {				/* For each line, ...	*/
		while (type[(c = get())] == SPA) /* Skip leading blanks	*/
		    ;				/* in this line.	*/
		if (c == '\n')			/* If line's all blank,	*/
		    ++counter;			/* Do nothing now	*/
		else if (c == '#') {		/* Is 1st non-space '#'	*/
		    keepcomments = FALSE;	/* Don't pass comments	*/
		    counter = control(counter);	/* Yes, do a #command	*/
		    keepcomments = (cflag && compiling);
		}
		else if (c == EOF_CHAR)		/* At end of file?	*/
		    break;
		else if (!compiling) {		/* #ifdef false?	*/
		    skipnl();			/* Skip to newline	*/
		    counter++;			/* Count it, too.	*/
		}
		else {
		    break;			/* Actual token		*/
		}
	    }
	    if (c == EOF_CHAR)			/* Exit process at	*/
		break;				/* End of file		*/
	    /*
	     * If the loop didn't terminate because of end of file, we
	     * know there is a token to compile.  First, clean up after
	     * absorbing newlines.  counter has the number we skipped.
	     */
	    if ((wrongline && infile->fp != NULL) || counter > 4)
		sharp();			/* Output # line number	*/
	    else {				/* If just a few, stuff	*/
		while (--counter >= 0)		/* them out ourselves	*/
		    putchar('\n');
	    }
	    /*
	     * Process each token on this line.
	     */
	    unget();				/* Reread the char.	*/
	    for (;;) {				/* For the whole line,	*/
		do {				/* Token concat. loop	*/
		    for (counter = 0; (type[(c = get())] == SPA);) {
#if COMMENT_INVISIBLE
			if (c != COM_SEP)
			    counter++;
#else
			counter++;		/* Skip over blanks	*/
#endif
		    }
		    if (c == EOF_CHAR || c == '\n')
			goto end_line;		/* Exit line loop	*/
		    else if (counter > 0)	/* If we got any spaces	*/
			putchar(' ');		/* Output one space	*/
		    c = macroid(c);		/* Grab the token	*/
		} while (type[c] == LET && catenate());
		if (c == EOF_CHAR || c == '\n')	/* From macro exp error	*/
		    goto end_line;		/* Exit line loop	*/
		switch (type[c]) {
		case LET:
		    fputs(token, stdout);	/* Quite ordinary token	*/
		    break;


		case DIG:			/* Output a number	*/
		case DOT:			/* Dot may begin floats	*/
		    scannumber(c, output);
		    break;

		case QUO:			/* char or string const	*/
		    scanstring(c, output);	/* Copy it to output	*/
		    break;

		default:			/* Some other character	*/
		    cput(c);			/* Just output it	*/
		    break;
		}				/* Switch ends		*/
	    }					/* Line for loop	*/
end_line:   if (c == '\n') {			/* Compiling at EOL?	*/
		putchar('\n');			/* Output newline, if	*/
		if (infile->fp == NULL)		/* Expanding a macro,	*/
		    wrongline = TRUE;		/* Output # line later	*/
	    }
	}					/* Continue until EOF	*/
}

output(c)
int		c;
/*
 * Output one character to stdout -- output() is passed as an
 * argument to scanstring() and scannumber().
 */
{
#if COMMENT_INVISIBLE
	if (c != TOK_SEP && c != COM_SEP)
#else
	if (c != TOK_SEP)
#endif
	    putchar(c);
}

static char	*sharpfilename = NULL;

FILE_LOCAL
sharp()
/*
 * Output a line number line.
 */
{
	register char		*name;

	if (keepcomments)			/* Make sure # comes on	*/
	    putchar('\n');			/* a fresh, new line.	*/
	printf("#%s %d", LINE_PREFIX, line);
	if (infile->fp != NULL) {
	    name = (infile->progname != NULL)
		? infile->progname : infile->filename;
	    if (sharpfilename == NULL || !streq(name, sharpfilename)) {
		if (sharpfilename != NULL)
		    free(sharpfilename);
		sharpfilename = savestring(name);
		printf(" \"%s\"", name);
	    }
	}
	putchar('\n');
	wrongline = FALSE;
}
-h- cpp2.c	Thu Mar 14 13:50:42 1985	CPP2.C;43
/*
 *				C P P 2 . C
 *
 *			   Process #control lines
 *
 * Edit history
 * 13-Nov-84	MM	Split from cpp1.c
 */

#include	<stdio.h>
#include	<ctype.h>
#include	"cppdef.h"
#include	"cpp.h"
#if HOST == SYS_VMS
/*
 * Include the rms stuff.  (We can't just include rms.h as it uses the
 * VaxC-specific library include syntax that Decus CPP doesn't support.
 * By including things by hand, we can CPP ourself.)
 */
#include	<nam.h>
#include	<fab.h>
#include	<rab.h>
#include	<rmsdef.h>
#endif

/*
 * Generate (by hand-inspection) a set of unique values for each control
 * operator.  Note that this is not guaranteed to work for non-Ascii
 * machines.  CPP won't compile if there are hash conflicts.
 */

#define	L_assert	('a' + ('s' << 1))
#define	L_define	('d' + ('f' << 1))
#define	L_elif		('e' + ('i' << 1))
#define	L_else		('e' + ('s' << 1))
#define	L_endif		('e' + ('d' << 1))
#define	L_if		('i' + (EOS << 1))
#define	L_ifdef		('i' + ('d' << 1))
#define	L_ifndef	('i' + ('n' << 1))
#define	L_include	('i' + ('c' << 1))
#define	L_line		('l' + ('n' << 1))
#define	L_nogood	(EOS + (EOS << 1))	/* To catch #i		*/
#define	L_pragma	('p' + ('a' << 1))
#define L_undef		('u' + ('d' << 1))
#if DEBUG
#define	L_debug		('d' + ('b' << 1))	/* #debug		*/
#define	L_nodebug	('n' + ('d' << 1))	/* #nodebug		*/
#endif

int
control(counter)
int		counter;	/* Pending newline counter		*/
/*
 * Process #control lines.  Simple commands are processed inline,
 * while complex commands have their own subroutines.
 *
 * The counter is used to force out a newline before #line and
 * #pragma commands.  This prevents these commands from ending up at
 * the end of the previous line if cpp is invoked with the -C option.
 */
{
	register int		c;
	register char		*tp;
	register int		hash;
	char			*ep;

	c = skipws();
	if (c == '\n' || c == EOF_CHAR)
	    return (counter + 1);
	if (!isdigit(c))
	    scanid(c);			/* Get #word to token[]		*/
	else {
	    unget();			/* Hack -- allow #123 as a	*/
	    strcpy(token, "line");	/* synonym for #line 123	*/
	}
	hash = (token[1] == EOS) ? L_nogood : (token[0] + (token[2] << 1));
	switch (hash) {
	case L_assert:	tp = "assert";		break;
	case L_define:	tp = "define";		break;
	case L_elif:	tp = "elif";		break;
	case L_else:	tp = "else";		break;
	case L_endif:	tp = "endif";		break;
	case L_if:	tp = "if";		break;
	case L_ifdef:	tp = "ifdef";		break;
	case L_ifndef:	tp = "ifndef";		break;
	case L_include:	tp = "include";		break;
	case L_line:	tp = "line";		break;
	case L_pragma:	tp = "pragma";		break;
	case L_undef:	tp = "undef";		break;
#if DEBUG
	case L_debug:	tp = "debug";		break;
	case L_nodebug:	tp = "nodebug";		break;
#endif
	default:	hash = L_nogood;
	case L_nogood:	tp = "";		break;
	}
	if (!streq(tp, token))
	    hash = L_nogood;
	/*
	 * hash is set to a unique value corresponding to the
	 * control keyword (or L_nogood if we think it's nonsense).
	 */
	if (infile->fp == NULL)
	    cwarn("Control line \"%s\" within macro expansion", token);
	if (!compiling) {			/* Not compiling now	*/
	    switch (hash) {
	    case L_if:				/* These can't turn	*/
	    case L_ifdef:			/*  compilation on, but	*/
	    case L_ifndef:			/*   we must nest #if's	*/
		if (++ifptr >= &ifstack[BLK_NEST])
		    goto if_nest_err;
		*ifptr = 0;			/* !WAS_COMPILING	*/
	    case L_line:			/* Many			*/
	    /*
	     * Are pragmas always processed?
	     */
	    case L_pragma:			/*  options		*/
	    case L_include:			/*   are uninteresting	*/
	    case L_define:			/*    if we		*/
	    case L_undef:			/*     aren't		*/
	    case L_assert:			/*      compiling.	*/
dump_line:	skipnl();			/* Ignore rest of line	*/
		return (counter + 1);
	    }
	}
	/*
	 * Make sure that #line and #pragma are output on a fresh line.
	 */
	if (counter > 0 && (hash == L_line || hash == L_pragma)) {
	    putchar('\n');
	    counter--;
	}
	switch (hash) {
	case L_line:
	    /*
	     * Parse the line to update the "progname" field and
	     * the line number for the next input line.
	     * Set wrongline to force it out later.
	     */
	    c = skipws();
	    workp = work;			/* Save name in work	*/
	    while (c != '\n' && c != EOF_CHAR) {
		save(c);
		c = get();
	    }
	    unget();
	    save(EOS);
	    /*
	     * Split #line argument into <line-number> and <name>.
	     * We subtract 1 as we want the number of the next line.
	     */
	    line = atoi(work) - 1;		/* Reset line number	*/
	    for (tp = work; isdigit(*tp) || type[*tp] == SPA; tp++)
		;				/* Skip over digits	*/
	    if (*tp != EOS) {			/* Got a filename, so:	*/
		if (*tp == '"' && (ep = strrchr(tp + 1, '"')) != NULL) {
		    tp++;			/* Skip over left quote	*/
		    *ep = EOS;			/* And ignore right one	*/
		}
		if (infile->progname != NULL)	/* Give up the old name	*/
		    free(infile->progname);	/* if it's allocated.	*/
	        infile->progname = savestring(tp);
	    }
	    wrongline = TRUE;			/* Force output later	*/
	    break;

	case L_include:
	    doinclude();
	    break;

	case L_define:
	    dodefine();
	    break;

	case L_undef:
	    doundef();
	    break;

	case L_else:
	    if (ifptr == &ifstack[0])
		goto nest_err;
	    else if ((*ifptr & ELSE_SEEN) != 0)
		goto else_seen_err;
	    *ifptr |= ELSE_SEEN;
	    if ((*ifptr & WAS_COMPILING) != 0) {
		if (compiling || (*ifptr & TRUE_SEEN) != 0)
		    compiling = FALSE;
		else {
		    compiling = TRUE;
		}
	    }
	    break;

	case L_elif:
	    if (ifptr == &ifstack[0])
		goto nest_err;
	    else if ((*ifptr & ELSE_SEEN) != 0) {
else_seen_err:	cerror("#%s may not follow #else", token);
		goto dump_line;
	    }
	    if ((*ifptr & (WAS_COMPILING | TRUE_SEEN)) != WAS_COMPILING) {
		compiling = FALSE;		/* Done compiling stuff	*/
		goto dump_line;			/* Skip this clause	*/
	    }
	    doif(L_if);
	    break;

	case L_if:
	case L_ifdef:
	case L_ifndef:
	    if (++ifptr >= &ifstack[BLK_NEST])
if_nest_err:	cfatal("Too many nested #%s statements", token);
	    *ifptr = WAS_COMPILING;
	    doif(hash);
	    break;

	case L_endif:
	    if (ifptr == &ifstack[0]) {
nest_err:	cerror("#%s must be in an #if", token);
		goto dump_line;
	    }
	    if (!compiling && (*ifptr & WAS_COMPILING) != 0)
		wrongline = TRUE;
	    compiling = ((*ifptr & WAS_COMPILING) != 0);
	    --ifptr;
	    break;

	case L_assert:
	    if (eval() == 0)
		cerror("Preprocessor assertion failure", NULLST);
	    break;

	case L_pragma:
	    /*
	     * #pragma is provided to pass "options" to later
	     * passes of the compiler.  cpp doesn't have any yet.
	     */
	    printf("#pragma ");
	    while ((c = get()) != '\n' && c != EOF_CHAR)
		cput(c);
	    unget();
	    break;
 
#if DEBUG
	case L_debug:
	    if (debug == 0)
		dumpdef("debug set on");
	    debug++;
	    break;

	case L_nodebug:
	    debug--;
	    break;
#endif

	default:
	    /*
	     * Undefined #control keyword.
	     * Note: the correct behavior may be to warn and
	     * pass the line to a subsequent compiler pass.
	     * This would allow #asm or similar extensions.
	     */
	    cerror("Illegal # command \"%s\"", token);
	    break;
	}
	if (hash != L_include) {
#if OLD_PREPROCESSOR
	    /*
	     * Ignore the rest of the #control line so you can write
	     *		#if	foo
	     *		#endif	foo
	     */
	    goto dump_line;			/* Take common exit	*/
#else
	    if (skipws() != '\n') {
		cwarn("Unexpected text in #control line ignored", NULLST);
		skipnl();
	    }
#endif
	}
	return (counter + 1);
}

FILE_LOCAL
doif(hash)
int		hash;
/*
 * Process an #if, #ifdef, or #ifndef.  The latter two are straightforward,
 * while #if needs a subroutine of its own to evaluate the expression.
 *
 * doif() is called only if compiling is TRUE.  If false, compilation
 * is always suppressed, so we don't need to evaluate anything.  This
 * suppresses unnecessary warnings.
 */
{
	register int		c;
	register int		found;

	if ((c = skipws()) == '\n' || c == EOF_CHAR) {
	    unget();
	    goto badif;
	}
	if (hash == L_if) {
	    unget();
	    found = (eval() != 0);	/* Evaluate expr, != 0 is  TRUE	*/
	    hash = L_ifdef;		/* #if is now like #ifdef	*/
	}
	else {
	    if (type[c] != LET)		/* Next non-blank isn't letter	*/
		goto badif;		/* ... is an error		*/
	    found = (lookid(c) != NULL); /* Look for it in symbol table	*/
	}
	if (found == (hash == L_ifdef)) {
	    compiling = TRUE;
	    *ifptr |= TRUE_SEEN;
	}
	else {
	    compiling = FALSE;
	}
	return;

badif:	cerror("#if, #ifdef, or #ifndef without an argument", NULLST);
#if !OLD_PREPROCESSOR
	skipnl();				/* Prevent an extra	*/
	unget();				/* Error message	*/
#endif
	return;
}

FILE_LOCAL
doinclude()
/*
 * Process the #include control line.
 * There are three variations:
 *	#include "file"		Search somewhere relative to the
 *				current source file; if not found,
 *				treat as #include <file>.
 *	#include <file>		Search in an implementation-dependent
 *				list of places.
 *	#include token		Expand the token; it must be one of
 *				"file" or <file>.  Process as such.
 *
 * Note: the November 12 draft forbids '>' in the #include <file> format.
 * This restriction is unnecessary and not implemented.
 */
{
	register int		c;
	register int		delim;
#if HOST == SYS_VMS
	char			def_filename[NAM$C_MAXRSS + 1];
#endif

	delim = macroid(skipws());
	if (delim != '<' && delim != '"')
	    goto incerr;
	if (delim == '<')
	    delim = '>';
	workp = work;
	instring = TRUE;		/* Accept all characters	*/
	while ((c = get()) != '\n' && c != EOF_CHAR)
	    save(c);			/* Put it away.			*/
	unget();			/* Force nl after includee	*/
	/*
	 * The draft is unclear if the following should be done.
	 */
	while (--workp >= work && *workp == ' ')
	    ;				/* Trim blanks from filename	*/
	if (*workp != delim)
	    goto incerr;
	*workp = EOS;			/* Terminate filename		*/
	instring = FALSE;
#if HOST == SYS_VMS
	/*
	 * Assume the default .h filetype.
	 */
	if (!vmsparse(work, ".H", def_filename)) {
	    perror(work);		/* Oops.			*/
	    goto incerr;
	}
	else if (openinclude(def_filename, (delim == '"')))
	    return;
#else
	if (openinclude(work, (delim == '"')))
	    return;
#endif
	/*
	 * No sense continuing if #include file isn't there.
	 */
	cfatal("Cannot open include file \"%s\"", work);

incerr:	cerror("#include syntax error", NULLST);
	return;
}

FILE_LOCAL int
openinclude(filename, searchlocal)
char		*filename;		/* Input file name		*/
int		searchlocal;		/* TRUE if #include "file"	*/
/*
 * Actually open an include file.  This routine is only called from
 * doinclude() above, but was written as a separate subroutine for
 * programmer convenience.  It searches the list of directories
 * and actually opens the file, linking it into the list of
 * active files.  Returns TRUE if the file was opened, FALSE
 * if openinclude() fails.  No error message is printed.
 */
{
	register char		**incptr;
#if HOST == SYS_VMS
#if NWORK < (NAM$C_MAXRSS + 1)
    << error, NWORK isn't greater than NAM$C_MAXRSS >>
#endif
#endif
	char			tmpname[NWORK];	/* Filename work area	*/

	if (searchlocal) {
	    /*
	     * Look in local directory first
	     */
#if HOST == SYS_UNIX
	    /*
	     * Try to open filename relative to the directory of the current
	     * source file (as opposed to the current directory). (ARF, SCK).
	     */
	    if (filename[0] != '/'
	     && hasdirectory(infile->filename, tmpname))
		strcat(tmpname, filename);
	    else {
		strcpy(tmpname, filename);
	    }
#else
	    if (!hasdirectory(filename, tmpname)
	     && hasdirectory(infile->filename, tmpname))
		strcat(tmpname, filename);
	    else {
		strcpy(tmpname, filename);
	    }
#endif
	    if (openfile(tmpname))
		return (TRUE);
	}
	/*
	 * Look in any directories specified by -I command line
	 * arguments, then in the builtin search list.
	 */
	for (incptr = incdir; incptr < incend; incptr++) {
	    if (strlen(*incptr) + strlen(filename) >= (NWORK - 1))
		cfatal("Filename work buffer overflow", NULLST);
	    else {
#if HOST == SYS_UNIX
		if (filename[0] == '/')
		    strcpy(tmpname, filename);
		else {
		    sprintf(tmpname, "%s/%s", *incptr, filename);
		}
#else
		if (!hasdirectory(filename, tmpname))
		    sprintf(tmpname, "%s%s", *incptr, filename);
#endif
		if (openfile(tmpname))
		    return (TRUE);
	    }
	}
	return (FALSE);
}

FILE_LOCAL int
hasdirectory(source, result)
char		*source;	/* Directory to examine			*/
char		*result;	/* Put directory stuff here		*/
/*
 * If a device or directory is found in the source filename string, the
 * node/device/directory part of the string is copied to result and
 * hasdirectory returns TRUE.  Else, nothing is copied and it returns FALSE.
 */
{
#if HOST == SYS_UNIX
	register char		*tp;

	if ((tp = strrchr(source, '/')) == NULL)
	    return (FALSE);
	else {
	    strncpy(result, source, tp - source + 1);
	    result[tp - source + 1] = EOS;
	    return (TRUE);
	}
#else
#if HOST == SYS_VMS
	if (vmsparse(source, NULLST, result)
	 && result[0] != EOS)
	    return (TRUE);
	else {
	    return (FALSE);
	}
#else
	/*
	 * Random DEC operating system (RSX, RT11, RSTS/E)
	 */
	register char		*tp;

	if ((tp = strrchr(source, ']')) == NULL
	 && (tp = strrchr(source, ':')) == NULL)
	    return (FALSE);
	else {
	    strncpy(result, source, tp - source + 1);
	    result[tp - source + 1] = EOS;
	    return (TRUE);
	}
#endif
#endif
}

#if HOST == SYS_VMS

FILE_LOCAL int
vmsparse(source, defstring, result)
char		*source;
char		*defstring;	/* non-NULL -> default string.		*/
char		*result;	/* Size is at least NAM$C_MAXRSS + 1	*/
/*
 * Parse the source string, applying the default (properly, using
 * the system parse routine), and store the outcome in result.
 * Returns TRUE if it parsed, FALSE on error.
 *
 * If defstring is NULL, there are no defaults and result gets
 * (just) the node::[directory] part of the string (possibly "").
 */
{
	struct FAB	fab = cc$rms_fab;	/* File access block	*/
	struct NAM	nam = cc$rms_nam;	/* File name block	*/
	char		fullname[NAM$C_MAXRSS + 1];
	register char	*rp;			/* Result pointer	*/

	fab.fab$l_nam = &nam;			/* fab -> nam		*/
	fab.fab$l_fna = source;			/* Source filename	*/
	fab.fab$b_fns = strlen(source);		/* Size of source	*/
	fab.fab$l_dna = defstring;		/* Default string	*/
	if (defstring != NULLST)
	    fab.fab$b_dns = strlen(defstring);	/* Size of default	*/
	nam.nam$l_esa = fullname;		/* Expanded filename	*/
	nam.nam$b_ess = NAM$C_MAXRSS;		/* Expanded name size	*/
	if (sys$parse(&fab) == RMS$_NORMAL) {	/* Parse away		*/
	    fullname[nam.nam$b_esl] = EOS;	/* Terminate string	*/
	    result[0] = EOS;			/* Just in case		*/
	    rp = &result[0];
	    /*
	     * Remove stuff added implicitly, accepting node names and
	     * dev:[directory] strings (but not process-permanent files).
	     */
	    if ((nam.nam$l_fnb & NAM$M_PPF) == 0) {
		if ((nam.nam$l_fnb & NAM$M_NODE) != 0) {
		    strncpy(result, nam.nam$l_node, nam.nam$b_node);
		    rp += nam.nam$b_node;
		    *rp = EOS;
		}
		if ((nam.nam$l_fnb & NAM$M_EXP_DEV) != 0) {
		    strncpy(rp, nam.nam$l_dev, nam.nam$b_dev);
		    rp += nam.nam$b_dev;
		}
		if ((nam.nam$l_fnb & NAM$M_EXP_DIR) != 0) {
		    strncpy(rp, nam.nam$l_dir, nam.nam$b_dir);
		    rp += nam.nam$b_dir;
		    *rp = EOS;
		}
	    }
	    if (defstring != NULLST) {
		strncpy(rp, nam.nam$l_name, nam.nam$b_name + nam.nam$b_type);
		rp += nam.nam$b_name + nam.nam$b_type;
		*rp = EOS;
		if ((nam.nam$l_fnb & NAM$M_EXP_VER) != 0) {
		    strncpy(rp, nam.nam$l_ver, nam.nam$b_ver);
		    rp[nam.nam$b_ver] = EOS;
		}
	    }
	    return (TRUE);
	}
	return (FALSE);
}
#endif

