DECUS C LANGUAGE SYSTEM

DECUS C Compiler Reference Manual

by David G. Conroy

Edited by Martin Minow, John D. Morton and Robert B. Denny

This document describes the CC compiler itself (including implementation quirks and known bugs), along with procedures for compiling and executing programs under a wide variety of Digital operating systems.

DECUS Structured Languages SIG
Version of 6-Dec-82

NOTE

This software is made available without any support whatsoever. The person responsible for an implementation of this system should expect to have to understand and modify the source code if any problems are encountered in implementing or maintaining the compiler or its run-time library. The DECUS 'Structured Languages Special Interest Group' is the primary focus for communication among users of this software.

UNIX is a trademark of Bell Telephone Laboratories. RSX, RSTS/E, RT-11 and VMS are trademarks of Digital Equipment Corporation.

CHAPTER 1

INTRODUCTION

CC is a multipass C compiler for the PDP-11 that runs under the RSX-11, VMS (compatibility mode), RSTS/E, and/or RT-11 operating systems. Except for the restrictions noted in a later section, it compiles programs as per the description of C in the Unix Seventh Edition documentation or the book The C Programming Language by Brian Kernighan and Dennis Ritchie (Englewood Cliffs, NJ: Prentice-Hall, ISBN 0-13-110163-3).

In general, the code produced by this compiler is quite well optimized for the PDP-11. Quality of the generated code is, however, dependent on the programmer's understanding of both the language and the target machine (the PDP-11). In particular, proper use of register variables and the prefix '--' and postfix '++' operators with pointers can result in surprising reductions in size and increases in speed. Experience is the best teacher.

Restrictions

1.
On VMS V3.0 and V3.1, Decus C cannot compile source files in "Stream format" as generated by Vax-11 C and by network copy operations from, e.g., RSTS/E V7.1. These files must first be converted to "variable-length" format by using the scopy.c utility or an editor. Using TECO, you can convert such files as follows:

        teco
        *erINPUT.FIL$$
        *ewOUTPUT.FIL$$
        *ex$$

Do not use "edit backup" as this will not change file attributes.

The C runtime library contains a dynamically-installed, release-specific patch to the RSX file open routine on VMS compatibility mode that enables C programs to read stream-format files. It is highly likely that a maintenance release of VMS will require changes to the run-time library (INIT.MAC) to change or remove the patch.

2. On VMS V3.0, there is a problem passing Unix-style options to the compiler, assembler, and running programs. When compiling or assembling C programs on VMS, the options must follow the file name. Software in the run-time library "fixes" this problem, but may require modification for subsequent releases of VMS.

3. On RSX-11M systems, it is important to taskbuild C programs with checkpointing enabled as they must expand the task image during execution.

4. On all RSX modes, programs should be task-built on the target system -- building images on RSX-11M that will be run on VMS does not seem to work very well. One problem is that file control services differ between the various implementations of RSX-11M.

CHAPTER 2

USING THE C COMPILER

Since the C compiler runs on so many operating systems, command information is presented in individual sections for the various operating system families, followed by a common section describing usage and the switches needed to control compilation.
2.1 VMS, RSX-11, and RSTS/E RSX Emulation Mode

After the appropriate setup sequence (described in a later section) has been executed, the compiler may be invoked as follows:

        XCC file [-switches]

or

        RUN C:XCC
        CC> [type command line here]

The specified file is compiled and the resulting assembly code is placed in a file having the same name as the source file but with a filetype of 'S'. The default filetype for source files is 'C'. The file will be written to the user's current default account. On RSTS, this is the account under which the user is logged in.

Diagnostics are written to the standard output. The diagnostic stream may be redirected by means of the '>' or '>>' conventions: '>filename' writes diagnostics to the named file, while '>>filename' appends diagnostics to the named file. This is compatible with Unix usage.

Only a single file may be compiled at one time. Wildcards are not legal in file names.

The resulting assembly language is assembled with AS as follows:

        XAS file -d

The generated code should never have any assembly errors. The '-d' switch deletes the input file ('file.s') unless an error is detected.

Object files are built into executable images by using one of the RSX-11M task builders. The simplest command sequences possible on native RSX-11M are:

        >FTB prog/CP=objects,LB:[1,1]C/LB

or

        >TKB prog/CP=objects,LB:[1,1]C/LB

Alternatively, on VMS, RSX-11, or RSTS/E RSX, the task builder may be invoked explicitly:

        TKB>prog/CP,map=objects,LB:[1,1]C/LB    (native RSX)
        TKB>//

        TKB>prog,map=objects,C:C/LB             (VMS, RSTS/E RSX)
        TKB>//

NOTE

On native RSX-11M, the C OTS is normally kept on the system library device LB: in UIC [1,1], and cannot be referenced as "C:C/LB". On RSX-11M PLUS, there is a 'libuic' on which the C library would be kept, and may not be [1,1]. On other systems the library may be referenced as "C:C/LB".
If a program uses large amounts of automatically-allocated storage, the "STACK = number" option should be specified to the task builder.

A C program may be built with the 5.6K PLAS overlaid FCS resident library FCSRES using the "LIBR = FCSRES:RO" option.

2.2 RT-11 or RSTS/E RT-11 Emulation Mode

After the setup sequence described in a later section has been executed, the compiler may be run as follows:

        RUN C:CC file [/switches]       (native RT-11)
        CC file [/switches]             (RSTS/E)

or

        RUN C:CC
        CC> file [/switches]

or

        CC> file.s,file.tm1,file.tmp=file.c [/switches]

The latter case explicitly creates and saves the intermediate code (.TM1) and expanded source (.TMP) files. Normally, these are needed only when debugging the compiler. Note that if you do not specify extensions for the intermediate files, they will be given the default of '.TMP' for the expanded source and '.TM1' for the intermediate code file.

The resulting assembly language is assembled with AS as follows:

        RUN C:AS file/d                 (RT-11)
        AS file/d                       (RSTS/E)

or

        RUN C:AS
        AS> file/d

The generated code should never have any assembly errors. The '/d' switch deletes the input file ('file.s') unless an error is detected.

Object modules are linked into executable images by using the RT-11 linker:

        LINK/BOT:2000 prog,objects,C:(SUPORT,CLIB)      (RT-11)
        LINK save,map=objects,C:SUPORT,C:CLIB/B:2000    (RSTS/E)

The two library files contain the actual main program (in SUPORT) and the RT-11 run-time support library. The start address must be at least 2000 to allow for dynamic storage by subroutines. If the '/BOTTOM' option or the '/b' switch is omitted, executing printf() may cause the program to abort with an 'M-trap to 4' message.

2.3 Compilation Notes

MACRO-11 may NOT be used to assemble the output of CC. CC expects that its assembler can perform certain optimizations (most notably branch adjustment) not performed by MACRO-11.
The title of the object file will be set to the first six characters of the source file name. The "ident" defaults to all blanks, but may be set with the (non-transportable) "ident" statement in the source code.

The compiler writes on files 'file.TMP' and 'file.TM1'. It is, therefore, unwise to keep important things in files with these filetypes.

The '.TMP' file contains the C source with #include and #define statements processed. This is the input to the compiler proper.

The '.TM1' file contains the intermediate code generated by the compiler parser. This is the input to the code generator.

2.4 Switches

Under RSX modes, switches are given as single letters preceded by a minus sign:

        XCC test -w

(On Vax VMS, the switch must follow the filename to be recognized. This is a permanent restriction.)

Under RSTS/E or RT-11, switches are given as single letters preceded by a slash:

        RUN C:CC test/s                 (native RT-11)

Case is not significant. All switches are shown, although many are of interest only to persons charged with maintaining the compiler. The following switches are defined:

a       This optional argument causes the compiler to chain to the assembler, assembling the .S file. Note: it works on native RSX-11M and RT-11 systems, but will not work on RSX-11M emulated on VMS or RSTS/E. Also, on native RT-11, the switch may not be given in a command file. The chain takes place only if no errors were detected by the compiler. On RSX-11M, AS is invoked as if the "XAS -D file" command were given. On RT-11, AS is invoked as if "RUN C:AS file/D" were given.

d       This optional argument causes the compiler to treat floating-point according to C language specifications: when calling a function, single-precision floating point variables and constants are converted to double-precision. Also, functions always return double-precision results. This overrides the installation default.
e       This optional argument causes in-line code to be generated for multiply, divide, xor, and shift operations, using the PDP-11 extended instruction set (EIS). Note: when CC is installed, this may be made the default. The 'n' switch disables in-line generation of EIS operations.

f       This optional argument causes the compiler to pass single-precision floating point variables and constants to functions without extending them to double-precision. It is thus incompatible with C language standards, but is more efficient for certain applications. This may be made the default when CC is installed.

l       This optional argument causes internal code trees to be written (as comments) to the .S output file. This option is for compiler maintenance.

m       This optional argument disables the preprocessor. It is for use when the source (.C) file has already been processed by the mp macro preprocessor.

n       This optional argument causes the compiler to call subroutines for multiply, divide, xor, and shift operations. When CC is installed, this may be made the default.

p       This optional argument causes profiling code to be compiled (see the section on profiling).

r       This optional argument defines the symbol rsts before the source code is compiled.

v       Ignored. In previous releases it caused the compiler to echo the current line of the source onto the error stream whenever an error was detected. (This is now permanent.) In most cases, the line echoed is not the line containing the error, because the parser usually has to read the next symbol of the source to determine that an error exists. It will usually be within 1 line, which should be close enough to locate the error.

w       This optional argument suppresses the "variable was defined but never referenced" warning message.

x       This optional argument is for debugging the compiler. It causes the compiler to retain intermediate files and to print timings of each compiler pass.
z       This optional argument causes the compiler to execute a breakpoint trap when entering each overlay segment. It is used only for debugging the compiler. It is listed here only as a warning for the fumble-fingered typist.

2.5 Setup of the Compiler

Before using the C compiler, it must be made known to the operating system. This differs slightly for the various systems.

2.5.1 Setup under VMS

The following setup (or something much like it) should be added to your LOGIN.COM file:

        $ ASSIGN DBA0:[PUBLIC] C
        $ XCC :== $C:CC.EXE CC
        $ XAS :== $C:AS.EXE AS

This enables the command sequences described above. If your compiled C program is to make use of the (Unix-compatible) startup sequence, you must proceed as follows:

        $ XCC foo
        $ XAS -d foo
        $ MCR TKB foo,foo=foo,c:c/lb

Then, you must type:

        $ FOOBAR :== "$DISK:[ACCOUNT]FOO.EXE"
        $ FOOBAR Unix-style parameters

The '$' tells the VMS command interpreter that a command is being defined. On VMS, the 'task name' will be passed to the program as argv[0] when the program starts.

2.5.2 Setup under RSTS/E RSX emulation mode

Under RSTS/E, the system manager must define the XCC and XAS CCL commands and the C: system-wide logical in a start control file such as the following (the account may be chosen to meet the system manager's needs):

        RUN $UTILTY
        ? ADD LOGICAL SY:[5,2]C
        ? CCL XAS-=C:AS.TSK;0
        ? CCL XCC-=C:CC.TSK;0
        ? CCL CRUN-=C:CRUN.*;30000
        ? EXIT

The CRUN program emulates a CCL startup sequence for compiled C programs.

2.5.3 Setup under RSX-11M

As it is assembled, the CC compiler looks for #include files of the form '' on logical device 'C:'. This will not work on RSX-11M, so the distributed compiler build file does a 'GBLPAT' to the location labeled 'SYSINC' to change it to "LB:[1,1]".
On an RSX-11M PLUS system, you should change this to your 'libuic' if necessary by editing MMAKCC.CMD.

Install CC and AS as MCR external commands '...XCC' and '...XAS', respectively. The CC compiler MUST be installed checkpointable in a mapped system to allow for task extension. If you have an unmapped system, or do not have the 'extend task' directive in your executive, install CC with at least 'INC=20000', and more if you get compiler aborts.

2.5.4 Setup under RT-11 and RSTS/E RT-11 mode

Under RT-11, setup consists of simply ASSIGNing a physical device to the logical device "C:". The compiler and assembler .SAV files, the SUPORT.OBJ module, and the library CLIB.OBJ should be placed on device 'C:'. You can make the assignment of device 'C:' as part of the startup command file, e.g.:

        .ASSIGN RK0: C:

This compiler has been built and used under RT-11 V3B and V4. It has run on a PDP-11/34, a PDP-11/05 and on PDT150 systems.

Under RSTS/E, the system manager must execute a startup control file such as the following:

        RUN $UTILTY
        ? ADD LOGICAL SY:[5,2]C
        ? CCL AS-=C:AS.SAV;8192
        ? CCL CC-=C:CC.SAV;8220
        ? CCL CRUN-=C:CRUN.*;30000
        ? EXIT

The CRUN command emulates a CCL startup for compiled C programs.

2.6 Invoking Compiled C Programs

When your program begins to execute and the startup module sees that something more than just "run foo" or the like has been typed, a Unix C setup sequence is emulated, including I/O redirection and command argument processing. The startup module does not expand wild-card filenames, however.

On native RSX-11M/M+ this feature is available in three flavors. If the program has been installed as an "MCR external command" (with a task name of "...xxx"), typing "xxx command line" will start up the program and the command line will be parsed into arguments.
Installing a program as an MCR external command requires that you be a privileged user. If the program is manually run ("run xxx"), it will prompt with the installed task name ("TTn" if not already installed or run with a specified task name), to which you may reply with the command line. In RSX-11M V4.0, it is possible to run an uninstalled task and supply a command line in one operation by using the "/CMD = " option on the run command. This form of program activation is commonly called a "flying install", and is particularly convenient for use with indirect command files which include C program activations.

On RSX emulated under RSTS/E, the CRUN CCL command may be used to pass arguments to C programs: "CRUN PROGRAM args", while on VMS, C programs may be installed as foreign commands.

On RT-11, if no command line has been passed, the module prompts "Argv: " and accepts a single line which is then parsed into command arguments. This can be disabled by defining the $$narg global symbol as described in the library documentation.

NOTE

On native RT-11, a command line passed via "RUN prog ..." which has more than one 'token' or 'word' in it gets parsed by the RT-11 monitor before it ever gets to the C program. See the documentation in the RT-11 manual on the 'RUN' command. It causes an "=" sign to get inserted, and the order of arguments is shuffled. To get around this, either use "RUN prog" and answer the "Argv: " prompt with the command line, or enclose the command line in some delimiter plus a space, e.g.:

        RUN C:prog [ command line ]

which tacks an extra token on to the command line that looks like "]=[" for the case above. There is no problem on command lines which have one token.

In some applications, it may be desirable to control what is generated for a command line prompt, or to eliminate the prompt altogether.
If you include an initialized string declaration in your program of the form:

        char $$prmt = "Prompt>";

then "Prompt>" will be the prompt string instead of "Argv" or the task name. In addition, if you include an initialized integer declaration in your program of the form:

        int $$narg = 1;

then the explicit command line prompt will be suppressed altogether. Note that these must be global initializations; do not prefix either of these with "static".

NOTE

It is the declaration itself that causes the change in prompting behavior. The declaration establishes a definition for a global symbol, suppressing use of a default module in the C library at program link time.

If you include in your command line an argument of the form '>file', standard output will be written to the indicated file. If you include an argument of the form '>>file', standard output will be appended to the file (creating it if necessary). Append does not work on RT-11 modes.

If you include an argument of the form '[...] parameter in the command definition as shown above.

o   On RSTS/E, this will be the CCL name or the program name as passed to the CRUN program.

o   On RT-11 (or on RSTS, by default, if no name can be found), this will be the string 'Argv: '.

For example:

        /*
         * Echo arguments
         */
        main(argc, argv)
        int argc;
        char *argv[];
        {
                register int i;

                printf("Program \"%s\" has %d parameters\n", argv[0], argc);
                /* Start at 0 so that argv[0] appears in the listing below */
                for (i = 0; i < argc; i++)
                        printf("Argument %d = \"%s\"\n", i, argv[i]);
        }

The above program is executed as follows on VMS:

        $ ECHO abc "def ghi"
        Program "ECHO" has 3 parameters
        Argument 0 = "ECHO"
        Argument 1 = "ABC"
        Argument 2 = "def ghi"

Notice that unquoted arguments are converted to upper case by the operating system.

Under RSTS/E, a C program may be installed as a CCL command or the program may be started using the CRUN CCL command which emulates a CCL startup for C programs.
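The '>file' and '>>file' conventions described above can be illustrated with a small sketch. It is not the startup module's code, which this manual does not show. It is written in present-day standard C rather than the Decus C dialect so that it can be tried on a current compiler. The names classify_arg and strip_redirects are hypothetical, and two behaviours are assumptions made for the sketch: a bare '>' with no file name is treated as an ordinary argument, and the last redirection on the line wins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* How a startup routine might treat one command-line token. */
enum redirect_kind {
    ARG_PLAIN,      /* ordinary argument, passed through in argv[] */
    ARG_WRITE,      /* ">file":  write standard output to file     */
    ARG_APPEND      /* ">>file": append standard output to file    */
};

/*
 * Classify one token.  For a redirection, *filename is pointed at the
 * text following the '>' or '>>' prefix; otherwise it is set to NULL.
 */
enum redirect_kind classify_arg(const char *arg, const char **filename)
{
    assert(arg != NULL && filename != NULL);

    if (strncmp(arg, ">>", 2) == 0 && arg[2] != '\0') {
        *filename = arg + 2;
        return ARG_APPEND;
    }
    if (arg[0] == '>' && arg[1] != '\0' && arg[1] != '>') {
        *filename = arg + 1;
        return ARG_WRITE;
    }
    *filename = NULL;       /* assumption: a bare ">" is an ordinary argument */
    return ARG_PLAIN;
}

/*
 * Remove redirection tokens from argv[1..argc-1] in place and report
 * the last one seen (assumption: the last redirection wins).
 * Returns the new argument count; argv[0] is always kept.
 */
int strip_redirects(int argc, const char *argv[],
                    const char **outfile, int *append)
{
    int in, out = 1;
    const char *name;

    assert(argc >= 1 && argv != NULL);
    assert(outfile != NULL && append != NULL);

    *outfile = NULL;
    *append = 0;
    for (in = 1; in < argc; in++) {
        switch (classify_arg(argv[in], &name)) {
        case ARG_WRITE:
            *outfile = name;
            *append = 0;
            break;
        case ARG_APPEND:
            *outfile = name;
            *append = 1;
            break;
        default:
            argv[out++] = argv[in];
            break;
        }
    }
    return out;
}
```

A caller would pass its argc and argv to strip_redirects() and, if outfile comes back non-NULL, reopen standard output on that file in write or append mode before calling main().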
2.7 Predefined Symbols

Before reading the program source file, the C compiler defines several symbols (which may then be tested with '#if' or '#ifdef' statements):

        decus           This is the Decus compiler.
        nomacarg        This version does not allow macros with arguments.
        pdp11           Code is generated for the PDP-11.
        rsts            The -r (/r) switch was given when CC was invoked.
        rsx             The RSX compiler (or)
        rt11            The RT-11 compiler
        _DATE           The compilation date and time as a quoted string.
        c_rsts          The compiler is running on RSTS (under either rsx or rt11 emulation mode).
        c_vms           The compiler is running on VMS under rsx compatibility mode.

Except for '_DATE', the symbols will be #defined equal to 1.

Note that the rsx, rt11, c_rsts, and c_vms symbols are automatically #define'd depending on the base and execution-time operating system. The rsts symbol is only #define'd if the compiler was invoked with the '-r' or '/r' switch.

Suitable use of these symbols allows the programmer to generate system-specific code from a common, transportable, source file. Be warned, however, that using the c_rsts or c_vms symbols may result in non-transportable programs, as these symbols (which parallel the Decus C runtime library $$rsts and $$vms switches) are best set at program execution (rather than program compilation).

You may be interested to know that Vax-11 C predefines "vax", "vms", and "vax-11c", and many Unix based compilers predefine "unix".

2.8 Program Sections

Two directives, psect and dsect, have been added to the C language syntax to permit programmer control over the program sections generated. This was needed to permit C programs to be configured for read-only memory systems, and it simplifies writing RSTS/E run-time systems in C.

Warning

These directives are supported only on Decus C and the AS assembler distributed with Decus C. Programs using them are not transportable to other C compilers.
They can be "hidden" by suitable use of the "#ifdef decus" pre-processor directive.

To change all default sections within a compilation, use the psect directive as follows:

        /*
         * program
         */
        int normal;             /* goes into c$data section */
        func() {                /* goes into c$code section */
        }
        psect "xxx";            /* Name special sections */
        int funny;              /* goes into 'xxxdat' section */
        subr() {                /* goes into 'xxxcod' section */
        }
        psect "";               /* Null string means normal */
        int norm;               /* goes into c$data section */
        function() {            /* goes into c$code section */
        }

The dsect directive has the same syntax as the psect directive. It affects only the allocation of global and static data. The entire string is used.

        int normal;             /* Goes into the c$data section */
        dsect "mydata";         /* Switch to my section */
        int pure1;              /* Goes into the mydata section */
        func() {                /* Goes into the c$code section */
                static int more;
        }
        dsect "";               /* Revert to default c$data */

The dsect directive uses the first six non-blank (and non-control) characters of the quoted argument. Your program will not compile correctly (and you will not be warned) if the dsect argument matches any other program section. For example,

        dsect "c$strn";

will not work correctly.

The psect directive takes the first three non-blank (and non-control) bytes of the quoted argument together with 'cod', 'dat', etc. to form program sections.

The psect and dsect directives may not be given within a function. If a null string (or one with no non-blank text) is given, the compiler will revert to the standard program sections. (Dsect changes only the data section, while psect changes all.)

It is possible to append program section attributes to psect and dsect directives by following the section name with a space or tab, then with the attribute bytes.
The following are allowed:

        d       Data space
        i       Instruction space (default)
        r       Read only (default for c$code)
        w       Read write (default for others)
        g       Global scope
        l       Local scope
        o       Overlay
        c       Concatenate (default)

Thus, you can specify a read-only data section by the following:

        dsect "data r"

If the first byte of the psect/dsect string is blank (and there are attributes), the attributes of the default code and data sections are modified.

The compiler supports the following program sections:

        c$code  xxxcod  Executable code
        c$data  xxxdat  Global and static data
        c$strn  xxxstr  Strings
        c$mwcn  xxxmwc  Multi-word constants
        c$prof  xxxprr  Profile tables

If the psect string argument is less than three characters long, it will be padded with '$'.

Normally, c$code, c$strn, and c$mwcn sections are allocated to read-only memory, while c$data and c$prof sections may be allocated to read-write memory. (The C language standard assumes that strings and constants are writable.)

Note that the compiler will always generate references to the standard program sections, even if a psect directive is the first in the file. While Decus C cannot specify program section attributes (read-only, etc.), these can be specified by task-builder control files.

Strings, by default, are written to the c$strn program section. String vectors, however, require two allocations: a pointer value (in c$data) and a character string (in c$strn). If the program must control allocation of both the pointer and the string, the psect directive must be used. Your program will not compile correctly if you allocate pointer and string values in the same program section.

        dsect "rodata r";
        char *entry[] = {
                "string1",
                "string2"
        };

In the above, entry[0] will be in read-only program section rodata, while "string1", etc. will be in the read-write section c$strn.
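The psect naming rule (first three characters of the argument, padded with '$', followed by a suffix such as 'cod' or 'dat') can be sketched as a small function. This is only an illustration written in present-day standard C; it is not taken from the compiler source. The function name psect_name and its parameters are hypothetical, and the sketch assumes that the first blank or tab ends the name part, which is one reading of the attribute syntax described above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Build the six-character section name for a psect argument.
 *   arg     the quoted psect string, e.g. "xxx" or "ab r"
 *   suffix  three-letter suffix from the table: "cod", "dat", "str", "mwc"
 *   deflt   standard name used when arg has no name part, e.g. "c$data"
 *   out     receives the result; must hold at least 7 bytes
 */
void psect_name(const char *arg, const char *suffix,
                const char *deflt, char out[7])
{
    size_t n = 0;

    assert(arg != NULL && out != NULL);
    assert(suffix != NULL && strlen(suffix) == 3);
    assert(deflt != NULL && strlen(deflt) <= 6);

    /* Keep up to three printable characters; a blank or tab ends the name
       (assumption: attributes, if any, follow that blank). */
    for (; *arg != '\0' && *arg != ' ' && *arg != '\t' && n < 3; arg++) {
        if (isgraph((unsigned char)*arg))
            out[n++] = *arg;
    }

    if (n == 0) {               /* null or blank string: standard section */
        strcpy(out, deflt);
        return;
    }
    while (n < 3)               /* short names are padded with '$' */
        out[n++] = '$';
    memcpy(out + n, suffix, 3);
    out[6] = '\0';
}
```

Under these assumptions psect_name("xxx", "dat", "c$data", buf) leaves "xxxdat" in buf, a two-character name such as "ab" becomes "ab$cod" for the code section, and an empty string falls back to the standard name.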
        psect "xxx";
        dsect "xxxdat r";       /* Force readonly */
        dsect "rodata r";
        char *entry[] = {
                "string1",
                "string2"
        };
        psect "";

In the above, "string1" was allocated in read-only program section "xxxstr", while the entry[] pointers were allocated in read-only program section "rodata".

2.9 Module Identification

Normally, the CC compiler generates a '.title' directive with the title set to the first six characters of the filename, and no '.ident' directive. It is possible, however, to specify the module identification for a C program or module by using the "ident" directive. For example:

        ident "V01.00";         /* Version 1.0 */

will cause ".ident /V01.00/" to be generated in the output assembly code. This is useful when you need to identify programs and/or program components in libraries, load maps, and so forth.

Warning

This directive is supported only on Decus C and the AS assembler distributed with Decus C. Programs using "ident" are not transportable to other C compilers. "ident" should be "hidden" by suitable use of the "#ifdef decus" pre-processor directive.

2.10 Profiling

The profiler permits the accumulation of function call statistics during the execution of a program. If any of the files comprising a program were compiled with the profile option (and at least one of them has been called) then a call profile, listing the function name and the number of calls, will be written to file 'profil.out' when the program terminates. Also, if the program terminates because of a fatal error (such as an illegal memory reference), a register dump and call trace will be printed on the command terminal.

The run-time library contains several functions that can be called to dynamically print flow trace information. For more information, consult the C Runtime Library manual.
2.11 Diagnostics

There are two general classes of diagnostics: those that relate to compiler conditions, and those that relate to errors in the user's program.

The only type of compiler condition messages the user should see are those of the form "Cannot open .... file". These mean exactly what they say.

Other compiler condition messages are "Abort in phase x", "Abort loading phase x" and "Trap type x", where "x" is replaced by some small constant. If the abort is in pass two, you could be using a syntactic construction (such as bit fields) that is supported by the syntax analyser, but not by the code generator. The abort could also be caused by a "known bug" in the parser that permits certain syntactically-incorrect statements to be parsed (into garbage) which cannot be handled by the code generator. For example,

        printf("foo"):          /* Note colon, not semicolon */
        printf("bar");

The syntax analyser thinks the first printf() is part of a valid conditional statement, such as:

        i = (j == 0) ? printf("foo") : printf("bar");

The abort message mentions an illegal colon operator.

Of course, you could also be the proud owner of a compiler bug. Report your find to a guru. Remember the register dump and save your source file and both temporary files. They are important.

Errors in the user's programs are reported in English, tagged by the line number (which may be off by 1). Because of the nature of the language, errors sometimes snowball. If you are greeted by thousands of error messages, try fixing up the first few. You may be pleasantly surprised.

The following are common sources of 'thousands of errors':

o   If there is a missing right brace within a function, all succeeding functions will miscompile. The error message will include a tag of the form "within function xxxxx", where "xxxxx" is the function with the missing brace.
o   If there is a missing right parenthesis in an if or while statement which is followed by a left brace, the syntax analyser will 'lose' the brace, causing many messages:

        if ((foo = fopen("abc.def", "w") == NULL) {
        ...

o   In general, if the error message is "illegal expression", that is (probably) the current line. If the message is "illegal statement", you should look at the previous statement.

CHAPTER 3

RUNTIME ENVIRONMENT

This description of the C runtime environment is sketchy. The best reference is compiler generated code, and any question regarding 'how does it ....' can usually be answered by compiling a suitably contrived program.

3.1 Program Sections

The C compiler uses 5 program sections whose names may be overridden by the programmer, as described previously.

C$CODE is used for all executable code.

C$DATA is used for all static (read-write) data. The compiler issues the .even assembler directive to force word-alignment when necessary.

C$STRN is used for the bodies of all literal strings.

C$PROF is used to hold the names of functions and reference counts for the profiler. It contains read-write data.

C$MWCN is used for multi-word constants (long integers and floating-point values), as well as for transfer tables for the switch statement processor. It contains read-only data.

The compiler does not write a symbol table as such, making debugging a chore.

3.2 Register Usage

R5 is used as an environment frame pointer. It points to the highest address of the stack frame of the current function. In MACRO-11 programs, symbols C$PMTR and C$AUTO may be used to refer to the first parameter and first automatic variable, respectively. Thus, when writing a MACRO subroutine, the macro program should contain:

        MOV C$PMTR+(R5), Dst

to access parameters (the first parameter_number is 0). (This cannot be done when using the AS assembler.)
To access automatic variables, the recommended sequence is:

        MOV C$AUTO-(R5), Dst

where the first variable_number is 1. (This cannot be done when using the AS assembler.)

Registers R2, R3 and R4 are used as register variables. The first register variable to be declared goes in R4, the second in R3 and the third in R2. Any register not used as a register variable can be used as a temporary. Registers R0 and R1 are always scratch registers.

3.2.1 Calling Sequence

The first instructions in a C function are a 'JSR R5,CSV$' and a subtract to claim stack space. The 'CSV$' routine points R5 at the new stack frame and pushes registers R4, R3, and R2 onto the stack. (Note that the character '$' in the CC/MACRO environment is represented by '~' in the AS environment.) R0, R1 and the floating point registers are NOT saved. This means that if a C function is called asynchronously (i.e. from an AST routine) the caller must arrange to save these registers or be prepared to face the music. The RSX-11M executive interface library has the components needed to allow a C function to serve as an AST service routine.

Functions return via a 'JMP CRET$'. The return value is in R0 (for ints, chars and pointers), R0-R1 (for longs, high part in R0) or AC0 (floats and doubles).

The caller passes control to a function by first pushing the arguments (from right to left) onto the stack, calling the function via a 'JSR PC,FUNCTION', and popping the arguments off of the stack when the function returns. All arguments are passed as ints, longs (push low part, then push high part) or doubles. Characters are passed as integers; floats are passed as doubles.

3.3 Global Symbols Containing Radix-50 '$' and '.'

With this version of Decus C, it is possible to generate and access global symbols which contain the Radix-50 '.' and '$'.
The compiler allows identifiers to contain the ASCII '$', which becomes a Radix-50 '$' in the object code. The AS assembly code shows this character as a tilde (~). The underscore character (_) in a C program becomes a '.' in both the AS assembly language and in the object code. This allows C programs to access all global symbols:

        extern int $dsw;
        . . .
        printf("Directive status = %06o\n", $dsw);

The above prints the current contents of the task's directive status word.

                              NOTE

    Use of '$' in programs may not be transportable to other C compilers. It is supported on Vax-11 C.

Be careful about referencing global 'equates' in C. These are NOT address labels. For example, if a program declares "extern int is_suc;", where IS.SUC is externally equated to 1, and then uses is_suc in an expression, you will get the contents of location 1 (and probably an odd address trap!). It is possible (but unbeautiful) to get around this by suitable use of casts:

        extern char *is_suc;
        #define IS_SUC ((int) &is_suc)
        ...
        return (IS_SUC);

3.4 Virtual Addresses in C

When interacting with executives and MACRO-11 programs at the low level made possible by C, it is likely that virtual addresses (i.e., mapped memory addresses) will be manipulated and used as pointers. This is particularly true when using the RSX-11M interface library memory management functions. Also, the C storage allocator functions return virtual addresses, not C pointers.

It is important to make this distinction, owing to C's powerful address arithmetic capabilities. This is discussed in The C Programming Language by Kernighan and Ritchie, sections 5.4 and 5.6.

While virtual addresses are represented internally as unsigned integers, it would be wise for the programmer to adopt the convention of defining them as character pointers.
To make things crystal clear, one might write

        #define ADDR char *

making ADDR synonymous with 'character pointer'.

3.5 Profiler

When a program is compiled with the 'p' option, the standard function entrance sequence is replaced by a "JSR R5,PCSV$". Immediately following the call is a pointer to a counter word, which is followed by the name of the function as a null-terminated string. The 'PCSV$' routine increments the counter word (initially zero) on every call:

                .psect  c$code
        entry:  jsr     r5,pcsv$
                .word   prof
                .psect  c$prof
        prof:   .word   0               ; Incremented at each call
                .asciz  /entry/         ; Function name
                .even
                .psect  c$code
                ...

The printing of the profile is arranged by having 'PCSV$' stuff a global cell '$$PROF' with a pointer to the profile print routine. This routine (called automagically on exit) scans through memory looking for "JSR R5,PCSV$" instructions and prints the statistics to the file 'profil.out' via 'fprintf'.

Compiling a program with profiling has several additional advantages:

o   If the program fails because of an unexpected trap to the operating system (and the profile collection code was executed at least once), a register dump will be printed on the command terminal and the program will exit by calling error().

o   If the function's execution would cause the stack pointer to go below 600 octal, the program will be aborted after printing an error message.

o   It is possible to obtain a dynamic trace of the flow of a program by assigning the file descriptor of an open file to the global variable '$$flow'. For example:

        #include <stdio.h>
        extern FILE *$$flow;

        main ()
        {
                $$flow = fopen("trace.out", "w");
                process();
        }

    Note that the program may execute

        $$flow = stdout;

    to write the trace to the command terminal. To turn off tracing, close $$flow and set $$flow = NULL.
o   The caller() function may be used to obtain the name (in ascii) of a routine's caller:

        main ()
        {
                subr();
        }

        subr ()
        {
                printf("%s\n", caller());
        }

    When subr() is executed, it will print "main".

o   The calltr() function may be used to print a trace of calls from main() to the function that called calltr():

        main ()
        {
                subr();
        }

        subr ()
        {
                calltr(stdout);
        }

    When subr() is executed, it will print:

        [ main subr ]

    on the standard output file. If some routine in the call trace was not compiled with profiling, the octal address of the routine's entry point is printed. If the routine gets confused (perhaps because the program is exiting due to a trap), it prints "".

o   If the program exits by calling error() and the profile collection code was executed at least once, a call trace will be printed on the command terminal.

3.5.1 Example

A function max(a, b), which returns the maximum value of its two integer arguments, may be written as follows:

        max(arga, argb)
        int arga;
        int argb;
        {
                return((arga > argb) ? arga : argb);
        }

After compilation, the following .S code will be generated:

                .psect  c$prog
        max:    jsr     r5,csv$
                cmp     2(sp),4(sp)
                blt     .0
                mov     2(sp),r0
                br      .1
        .0:     mov     4(sp),r0
        .1:     jmp     cret$


CHAPTER 4

INCOMPATIBILITIES AND RESTRICTIONS

The language accepted by the compiler is the language described in the Unix Seventh Edition documentation (and Kernighan and Ritchie) with several exceptions. The file 'C:CBUGS.DOC' contains a current list of bugs. These should be regarded as restrictions -- anything that was easy to fix has been fixed.

4.1 Restrictions

o   Initialization of automatic and local static variables is not supported.

o   Enumerations are not supported.

o   Bit fields do not work -- attempting to use bit fields will cause the compiler to abort with a "missing code table entry" error.

o   Symbols defined as global may not be redefined as local to a function.

o   Variables may only be declared at function entrance.
    The latest C language specification allows variable declaration at any block entrance.

o   Only FPU (11/45, 11/70) floating point is supported. There is no code to support the FIS (11/40, 11/03) hardware, nor is there code present to emulate floating point.

o   The compiler does not support 'old-style' assigned binary operators. These will generally result in syntax errors. One exception (which started the whole mess) is "foo =- 6". This will be accepted by the compiler. Unfortunately, it will generate "foo = (-6)" when the program probably wanted "foo = foo - 6". You have been warned.

o   In order to ease conversion to Vax-11 C, variable initialization must be written "int foo = 123;". The 'old-style' "int foo 123;" compiles correctly, but will generate an annoying warning message.

o   The include statement has two modes:

        #include "filename"     Includes the fully-qualified file.

        #include <filename>     Includes the library file, equivalent to:
                                #include "C:filename"
                                (LB:[1,1] on RSX)

o   Macros (#define statement with arguments) do not exist.

o   As noted in the library documentation, the following built-in function may be overridden by the C program:

        wrapup()        Called when the program exits.

4.2 Incompatibilities

There are several incompatibilities between the current DECUS compiler and earlier versions which had been distributed by various DECUS special-interest groups. Those known (and the implications) are:

o   The RSX compiler's subroutine calling sequence has been changed to match the RT-11 compiler's (and Unix's). This means that all user-written assembly-language code must be modified.

    The calling sequence appears to be compatible with the Unix and Whitesmith compilers, although library names are different. Also, the Whitesmith compiler has several optimizations in its subroutine calling sequence that are not present in this compiler.
o   The underscore character now generates a RAD50 dot, instead of a dollar-sign. The compiler allows dollar-signs in local and global variables. Thus, C programs can now access all PDP-11 global symbols. Because of the change in the meaning of underscore, all user-written assembly-language code must be modified.

o   Program section names have been changed.

o   I/O library conventions now generally follow the Unix V7 and Vax-11 C definitions. There are several implications. In general, however, all C-language I/O calls should be examined. The major problems are described below.

    o   fopen("filename", "openmode") follows the RSX-library and Unix V7. This is incompatible with Unix V6 and the old RT11 library call. It is compatible with current Unix and Vax-11 C usage.

    o   fgets(buffer, sizeof buffer, fd) requires the second (buffer size) parameter, and does not remove the trailing newline. This follows Unix V7 I/O conventions.

        fgetss() is a new function, identical to fgets() except that it removes the trailing newline. fgetss() is compatible with the fgets() function in previous versions of the Decus compiler.

    o   fputs(buffer, fd) does not append a newline to the record. This follows Unix V7 I/O conventions.

        fputss() is a new function, identical to fputs() except that it appends a trailing newline.

4.2.1 Conversion from Unix

It is expected that many programs will be converted to the Decus compiler from Unix. While trivial programs will require no work whatsoever, most programs will require hand editing. Note the following:

o   Floating point requires floating-point hardware. Many floating-point variables can be recoded as long integers (large counters, for example). Anything else cannot be converted at all.

o   Unix V6 assigned binary operators must be converted to the new format. Most of these will be caught by the syntax analyser.
    Note, however, that "foo =- 6" will parse, generating incorrect code.

o   The Decus compiler has a 500-word expression stack. This means that many complex expressions (especially those with embedded conditional statements) will cause the compilation to abort. This requires rewriting.

o   The Decus compiler lacks macros with arguments. Many of these can be rewritten as function calls. If the program intentionally makes use of the fact that macros are expanded in-line, hand-editing will be needed. Note also that only one level of indirect (#include) file is supported.

o   The previous release of the compiler treated nested comments as follows:

        /* begin comment
           /* nested comment */
           more comment
           comment ends here: */

    The current version is compatible with other C compilers:

        /* begin comment...
           /* this generates a warning
           comment ends here: */

o   Unix "native" I/O is not supported. Thus, any program using read(), write(), open(), or creat() will require extensive modification. Note also that fopen() operates quite differently in the standard I/O package than it did in Unix V6. This requires rethinking but is fairly straightforward. Also, note that only a limited file random-access capability is present and that files may not be updated by this package.

o   Very large programs (which depend on Unix's ability to generate programs with separate instruction and data space) must be redone using the linker (task-builder) overlay capability. Non-trivial.

o   Programs that use large amounts of local storage (allocated on function entrance) must be linked with enough stack space. When testing a program, it is highly recommended that the program be compiled with profiling, as this enables a stack overflow check on function entrance.

    Note that, on Unix, the runtime stack and free storage (allocated by malloc()) compete for the same memory, relieving the programmer of the need to specify the maximum stack size.
    Unix programs that exploit this fact may prove hard to convert to the Decus compiler.

    Note that Unix manages stack and heap (malloc) memory quite differently than Decus C. In general, you should avoid large stack allocations in Decus C.

o   On the current (V3.1) release of VMS, problems with the RSX AME make parsing the command line particularly difficult. The RSX command line parser does not put a space between the command and the first argument if the first argument starts with a minus sign '-'. Thus, if the programmer types

        $ command :== $disk:[directory.list]filename
        $ command -arg1 arg2

    the Decus C initialization code sees

        "disk:[directory.list]filename-arg1 arg2"

In general, the programmer should be alert to such minor incompatibilities that do exist.


APPENDIX A

FILE CBUGS.DOC (17-SEP-80)

The following is a reproduction of the CBUGS.DOC file distributed with the DECUS C system.

This document contains the current list of errors and restrictions in the DECUS C language system. As errors are removed, they will be deleted from this file. Anything in this file should be regarded as permanent. The error list is in no particular order.

**
** Long input lines crash the compiler
**

If the source file contains a long input line, the compiler will crash with an "abort in phase 0" message. One source of compiler aborts is an input line that becomes very long when "#define" statements are expanded. "Long" in this context means 132 bytes.

**
** Bad syntax may crash the compiler
**

The construction

        return((c < 0) -1 : 0);

(with a missing "?") aborts in phase 2 with a "missing code table" error. It should yield a syntax error message.
A similar problem occurs when the programmer writes ':' when ';' was intended:

        printf("foo\n"):        /* Note colon */
        printf("bar\n");

or

        c = 'U':                /* Note colon */
        pointer--;

This gives a confusing "Abort in pass 2" error message ("Missing code table entry for CLN"). The problem is that the syntax analyser does not parse ternary expressions ("a = (b > c) ? d : e") correctly: it accepts the colon without a preceding '?'. The expression tree which is subsequently built cannot be processed by the code generator.

Since symbols are of finite length, there can be cases where a program compiles correctly, but fails in the assembler (AS) module. For example,

        int watchrule;
        int watchrules;

gives a "multiply defined symbol" error in AS, rather than in CC.

**
** External functions must be marked extern
**

External functions must be declared as:

        extern int func();

Writing

        int func();

will result in a syntax error message.

**
** Long composite constants compile incorrectly
**

The compiler incorrectly computes long constants such as "1<<16". These must be computed by hand and entered as exact values. There are two problems here. One is that the compiler computes the above expression in "int" precision, assigning a garbage value to the long constant. If you "fix" this by writing "1L<<16", the compiler will reject the expression, because compile-time computation of long expressions is currently unsupported -- see next entry.

**
** Long constant expressions are not optimized
**

If a program contains a constant expression involving longs, such as:

        func(1L + 1L);

the compiler does not generate

        func(2L);

What is more serious is that compiling:

        long value = 1L + 1L;

results in a "bad initializer" error message.
While the fix is straightforward (to routine MODIFY: in CC201.MAC, in case you're interested), it is a pain in the neck to implement.

**
** Repeated formal parameters
**

If a function definition contains the same formal parameter twice, as in:

        func(foo, bar, foo)

the compiler does not generate an error message.

**
** Intermediate files written to user's current directory
**

Early versions of RSX CC (from the Decus SIG distribution from 1978) wrote intermediate files and the compiler output (.s) file on the same disk/directory as the input file. This has been changed so as to write all output files onto the user's current directory. This is compatible with RT11 CC and general PDP11 practice. For example, assuming RSTS/E:

        Command                 Old                     New

        xcc [100,100]foo        [100,100]foo.s          sy:foo.s

Note that RT11 CC does a "normal" CSI scan, thus allowing placement of all files.

**
** Error processing relative branches in AS
**

The AS assembler does not process self-relative branches (br .+4) correctly. They must not be used. (Code generated by CC does not contain relative branches.)

**
** Errors in C parser
**

The construction

        if (...)
                do {
                        ...
                } while (...);
        else ...

is rejected with the message "illegal else". Rewrite as:

        if (...) {
                do {
                        ...
                } while (...);
        }
        else ...

This makes the syntax explicit for the parser (and programmer).

The parser may incorrectly assign expression type. For example,

        char c, *str;
        int foo, mask;

        foo = mask | (c = *str++);

The parser thinks that "(c = *str++)" is of type "pointer" and cannot compute the inclusive or operation.

**
** Old-style assigned binary operators
**

The C compiler really and truly does not support the old assigned binary operators (=+, =-, etc.).
In fact, the statement "foo=-6" is exactly equivalent to "foo = (-6)"; it IS NOT equivalent to "foo = foo - 6".

The Unix compiler processed the above ambiguous case by "taking the longest matching lexical entity." If the statement was written with embedded blanks as "foo = -6", the Unix compiler would, in effect, recognize the blank space between '=' and '-'.

Note that rejecting '=-' is an error in Decus C. This same problem may be seen in statements such as "foo=*x", which Decus C interprets as "foo = *x" while the Unix (or Vax-11 C) compiler interprets it as "foo =* x". The C syntax definition

**
** Structures must be defined globally
**

The C compiler rejects structure definitions within the body of a function:

        foobar()
        {
                struct foostr {
                        int *foo;
                } bar;
        }

However, if the structure definition is moved outside the function body, it will compile correctly:

        struct foostr {
                int *foo;
        };

        foo()
        {
                struct foostr bar;
        }

Note that this means that structures may not be defined one way in one function and another way in another function.

**
** Random global symbols may prevent task-building
**

The RSX file services emulator library under RSTS/E contains a global symbol "EOF". Consequently, C programs running under this library must not define a global by this name. The following will not task-build correctly on RSTS/E, RSX-11 mode:

        int eof;

        main()
        {
                ...
        }

Note that this restriction is due to the global's presence in the operating-system library, not the C run-time library.

In general, note that global symbols that match file services support globals may cause conflicts with other file services routines. This will generally result in undefined global symbols when task building. For example, function names should not begin with an underscore '_', to avoid matching RSX file-control service routine names.
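The EOF clash can be checked mechanically by applying the naming rules from section 3.3: an underscore becomes '.', a '$' is kept, and (as the is_suc / IS.SUC example suggests) letters are folded to upper case. The following is a minimal sketch in present-day standard C, not in the Decus C dialect. The helpers global_name() and clashes_with() and the replacement name inpeof are hypothetical and exist only for illustration. The sketch ignores symbol-length truncation, since this manual does not state the limit.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Decus C): derive the global symbol
 * name a C identifier would produce under the rules of section 3.3.
 * '_' becomes '.', letters are folded to upper case, and '$' and
 * digits pass through unchanged.  Truncation is NOT modelled.
 */
void global_name(const char *ident, char *out, size_t outsize)
{
    size_t i;

    assert(ident != NULL && out != NULL && outsize > 0);
    for (i = 0; ident[i] != '\0' && i + 1 < outsize; i++) {
        if (ident[i] == '_')
            out[i] = '.';
        else
            out[i] = (char) toupper((unsigned char) ident[i]);
    }
    out[i] = '\0';
}

/*
 * Hypothetical helper: returns 1 if the C identifier maps to the same
 * global name as the given library symbol, 0 otherwise.
 */
int clashes_with(const char *ident, const char *library_symbol)
{
    char name[64];

    global_name(ident, name, sizeof name);
    return strcmp(name, library_symbol) == 0;
}
```

With these helpers, clashes_with("eof", "EOF") returns 1, while a renamed variable such as inpeof gives clashes_with("inpeof", "EOF") == 0.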
**
** Structure definitions must be ordered
**

The C compiler is restrictive as to the ordering of definitions. For example, given the following structure definition:

        struct stack {
                int max_index;
                int current_index;
                int *vector;
        };

The compiler rejects the following sequence:

        struct stack entry = { DATUM_MAX, 0, datum };
        int datum[DATUM_MAX];

(as datum is not yet defined). However, it accepts the following sequence:

        int datum[DATUM_MAX];
        struct stack entry = { DATUM_MAX, 0, datum };

**
** Fwild requires a sorted directory on RSX-11M
**

Fwild/fnext will not properly process versions ;0 and ;-1 on native RSX systems that support FILES-11 (ODS-1) disk directory structures. The algorithm works correctly on ODS-2 structures (and thus on VMS compatibility mode). The problem is that, on ODS-1, files are not sorted in the directory. A Decus program, SRD.TSK, may be used to sort ODS-1 directories as needed. SRD is not included on the current Decus C distribution because of size limitations.

";-1" versions do not seem to work correctly.

**
** Strange RSX-11M file and record formats
**

Several RSX-11M and VAX utility programs generate output files that are not correctly processed by the C library. For example, VMS "print-image" files may encode blank lines in the record sequence field. As currently coded, the stream I/O routines (getc(), fgets(), etc.) will drop such lines.

Also, some utility programs write "unformatted" files, with multiple text lines stored in one (file-system) logical record. (Task-builder map files and output from Runoff offer examples of this phenomenon.)
In this case, sequential file reading is no problem but, if it is necessary to randomly process the file using ftell() and fseek(), you should note that the record-address returned by ftell() refers only to the file-system logical record -- i.e., it will not distinguish between the various text lines.

The octal-dump program, OD.C (in the tools package), will print record sequence numbers if they are present. The file typeout program, T.C (in the tools package), illustrates deblocking the various file formats and random access to multiple text lines.

Because stdout and stderr are assigned to different channels (luns), mixing output to stdout and stderr may result in inadvertent overprinting of output lines. Prefacing error message output with a newline '\n' should prevent this problem.

**
** Library Name Changes
**

In order to bring the library more in line with other C standard I/O libraries (primarily the VAX native C compiler), a few library functions have changed names. Sorry for the inconvenience.

        Old name    Becomes

        iovtoa      fgetname    Return file name
        sbreak      sbrk        Get incremental memory
        fmkdl       (obsolete)  Use delete() in new programs
                    delete      Delete a file, given its name
        ftty        isatty      TRUE if a terminal -- see below
        flun        fileno      Logical unit number -- see below

Logical unit numbers used in Decus C bear no relation to Unix file numbers. In particular, stdin, stdout, and stderr are not 0, 1, and 2. In fact, on some operating systems, stderr equals -1.

The proper test for "is this device a terminal" is now:

        isatty(fileno(fd));

**
** Fseek/Ftell Changes
**

A long-standing error in the RSX versions of fseek() and ftell() was corrected. Unfortunately, this means that any data files using cross-file pointers must be rebuilt from scratch. (An index file built using the old version of ftell() cannot be used with a program linked with the new version of fseek().)
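
Old index files cannot be reused because a value returned by ftell() is only meaningful to the matching fseek(). The following minimal sketch shows the kind of index that is affected. It is written in present-day standard C rather than the Decus C dialect; build_index() and read_line_at() are hypothetical names introduced for illustration, and every line is assumed to fit in the 256-byte buffer.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: record the ftell() position at the start of
 * each line of fp.  Returns the number of positions stored (at most
 * max).  Assumes no line is longer than 255 characters.
 */
int build_index(FILE *fp, long *offsets, int max)
{
    char line[256];
    int n = 0;
    long pos;

    assert(fp != NULL && offsets != NULL);
    rewind(fp);
    while (n < max) {
        pos = ftell(fp);
        if (pos < 0L || fgets(line, sizeof line, fp) == NULL)
            break;                      /* error or end of file */
        offsets[n++] = pos;
    }
    return n;
}

/*
 * Hypothetical helper: seek to a position saved by build_index() and
 * read one line.  Returns buf, or NULL if the seek or the read fails.
 */
char *read_line_at(FILE *fp, long offset, char *buf, int size)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return NULL;
    return fgets(buf, size, fp);
}
```

The offsets array plays the role of the index file: if the values in it were produced by a different version of ftell(), read_line_at() has no way to detect that and would simply seek to the wrong place.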