DECUS C LANGUAGE SYSTEM

DECUS C Compiler
Reference Manual

by
David G. Conroy

Edited by
Martin Minow, John D. Morton and Robert B. Denny

This document describes the CC compiler itself (including implementational quirks and known bugs), along with procedures for compiling and executing programs under a wide variety of Digital operating systems.

DECUS Structured Languages SIG
Version of 17-Sep-80 (PRELIMINARY)

Copyright (C) 1980, DECUS

General permission to copy or modify, but not for profit, is hereby granted, provided that the above copyright notice is included and reference made to the fact that reproduction privileges were granted by DECUS.

The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation or by DECUS.

Neither Digital Equipment Corporation, DECUS, nor the authors assume any responsibility for the use or reliability of this document or the described software.

This software is made available without any support whatsoever. The person responsible for an implementation of this system should expect to have to understand and modify the source code if any problems are encountered in implementing or maintaining the compiler or its run-time library. The DECUS 'Structured Languages Special Interest Group' is the primary focus for communication among users of this software.

UNIX is a trademark of Bell Telephone Laboratories. RSX, RSTS/E, RT11 and VMS are trademarks of Digital Equipment Corporation.


CHAPTER 1

INTRODUCTION

CC is a multipass C compiler for the PDP-11 that runs under the RSX-11, VMS (compatibility mode), RSTS/E, and/or RT11 operating systems.
Except for the restrictions noted in a later section, it compiles programs as per the description of C in the Unix Seventh Edition documentation or the book The C Programming Language by Brian Kernighan and Dennis Ritchie (Englewood Cliffs, NJ: Prentice-Hall, ISBN 0-13-110163-3).

In general, the code produced by this compiler is quite well optimized for the PDP-11. Quality of the generated code is, however, dependent on the programmer's understanding of both the language and the target machine (the PDP-11). In particular, proper use of register variables and the prefix '--' and postfix '++' operators with pointers can result in surprising reductions in size and increases in speed. Experience is the best teacher.


CHAPTER 2

USING THE C COMPILER

Since the C compiler runs on so many operating systems, command information is presented in individual sections for the various operating system families, followed by a common section describing usage and the switches needed to control compilation.

2.1 VMS, RSX-11, and RSTS/E RSX emulation mode

After the appropriate setup sequence (described in a later section) has been executed, the compiler may be invoked as follows:

        XCC [-switches] file
or
        RUN C:XCC
        CC> [type command line here]

The specified file is compiled and the resulting assembly code is placed in a file having the same name as the source file but with a filetype of 'S'. The default filetype for source files is 'C'. The file will be written to the user's current default account. On RSTS, this is the account under which the user is logged in.

Diagnostics are written to the standard output. The diagnostic stream may be redirected by means of the '>' or '>>' conventions: '>filename' writes diagnostics to the named file, while '>>filename' appends diagnostics to the named file. This is compatible with Unix usage.

Only a single file may be compiled at one time. Wildcards are not legal in names.
The resulting assembly language is assembled with AS as follows:

        XAS -d file

The generated code should never have any assembly errors. The '-d' switch deletes the input file ('file.s') unless an error is detected. Note that it is not possible to RUN XAS.

Object files are linked into executable images by using one of the RSX-11M task builders. The simplest command sequences possible on native RSX-11M are:

        >FTB prog/CP=objects,[1,1]C/LB
        >TKB prog/CP=objects,[1,1]C/LB

Alternatively, on VMS, RSX-11, or RSTS/E RSX, the task builder may be invoked explicitly:

        TKB>prog/CP,map=objects,[1,1]C/LB       (native RSX)
        TKB>//

        TKB>prog,map=objects,C:C/LB             (VMS, RSTS/E RSX)
        TKB>//

NOTE

On native RSX-11M, the C OTS is normally kept on UIC [1,1], and cannot be referenced as "C:C/LB". On RSX-11M PLUS, there is a 'libuic' on which the C library would be kept, which may not be [1,1]. On the other systems the library may be referenced as "C:C/LB".

If a program uses large amounts of automatically-allocated storage, the "STACK = number" option should be specified to the task builder. A C program may be built with the 4K FCS resident library FCSRES using the "LIBR = FCSRES:RO" option.

2.2 RT11 or RSTS/E RT11 emulation mode

After the setup sequence described in a later section has been executed, the compiler may be run as follows:

        RUN C:CC file [/switches]       (native RT-11)
        CC file [/switches]             (RSTS/E)
or
        RUN C:CC
        CC> file [/switches]
or
        CC> file.s,file.tm1,file.tmp=file.c [/switches]

The latter case explicitly creates and saves the intermediate code (.tm1) and expanded source (.tmp) files. Normally, these are needed only when debugging the compiler. Note that if you do not specify extensions for the intermediate files, they will be given the default of '.tmp' for the expanded source and '.tm1' for the intermediate code file.
The resulting assembly language is assembled with AS as follows:

        RUN C:AS file/d                 (RT-11)
        AS file/d                       (RSTS/E)
or
        RUN C:AS
        AS> file/d

The generated code should never have any assembly errors. The '/d' switch deletes the input file ('file.s') unless an error is detected.

Object modules are linked into executable images by using the RT11 linker:

        LINK/BOT:2000 prog,objects,C:(SUPORT,CLIB)      (RT-11)
        LINK save,map=objects,C:SUPORT,C:CLIB/B:2000    (RSTS/E)

The two library files contain the actual main program (in SUPORT) and the RT11 run-time support library. The start address must be at least 2000 to allow for dynamic storage by subroutines. If the '/BOTTOM' option or the '/b' switch is omitted, executing printf() may cause the program to abort with an 'M-trap to 4' message.

2.3 Compilation notes

MACRO-11 may NOT be used to assemble the output of CC. CC expects that its assembler can perform certain optimizations (most notably branch adjustment) not performed by MACRO-11.

The title of the object file will be set to the first six characters of the source file name. This is of interest only to people who load overlaid programs off libraries.

The compiler writes on files 'file.TMP' and 'file.TM1'. It is, therefore, unwise to keep important things in files with these filetypes.

The '.TMP' file contains the C source with #include and #define statements processed. This is the input to the compiler proper.

The '.TM1' file contains the intermediate code generated by the compiler parser. This is the input to the code generator.

2.4 Switches

Under RSX modes, switches are given as single letters preceded by a minus sign:

        XCC -v -s test

Under RSTS/E or RT11, switches are given as single letters preceded by a slash:

        RUN C:CC test/v/s               (native RT-11)

Case is not significant.
The following switches are defined:

d       This argument causes the compiler to execute a breakpoint trap when entering each overlay segment. It is used only for debugging the compiler.

e       This optional argument causes in-line code to be generated for multiply, divide, xor, and shift operations. NOTE: the current compiler recognises this switch, but does not generate in-line code.

f       This optional argument causes in-line code to be generated for floating-point operations. NOTE: the current compiler recognises this switch, but does not support floating-point. Any attempt to compile floating-point operations will result in a fatal compilation error.

i       This optional argument causes the compiler to retain the intermediate file (phase 1 to phase 2). This file is normally deleted. This option is for compiler maintenance.

l       This optional argument causes internal code trees to be written (as comments) to the .S output file. This option is for compiler maintenance.

m       This optional argument causes timings of each pass to be printed. This option is only operative on RSX-11 modes. It requires hardware EIS.

p       This optional argument causes profiling code to be compiled (see the section on profiling).

s       This optional argument causes the compiler to retain the expanded source file (phase 0 to phase 1). This file is normally deleted. This option is for compiler maintenance.

v       This optional argument causes the compiler to echo the current line of the source onto the error stream whenever an error is detected. In most cases, this is not the line containing the error, because the parser usually has to read the next symbol of the source to determine that an error exists. It will usually be within 1 line, which should be close enough to locate the error.

2.5 Setup of the compiler

Before using the C compiler, it must be made known to the operating system. This differs slightly for the various systems.
2.5.1 Setup under VMS

The following setup (or something much like it) should be added to your LOGIN.COM file:

        $ ASSIGN DBA0:[PUBLIC] C
        $ XCC :== $C:CC.EXE CC
        $ XAS :== $C:AS.EXE AS

This enables use of the command sequences described above.

If your compiled C program is to make use of the (Unix-compatible) startup sequence, you must proceed as follows:

        $ XCC foo
        $ XAS -d foo
        $ MCR TKB foo,foo=foo,c:c/lb

Then, you must type:

        $ FOOBAR :== "$DISK:[ACCOUNT]FOO.EXE "
        $ FOOBAR Unix-style parameters

The '$' tells the VMS command interpreter that a command is being defined. Note that a dummy parameter must be specified. This will become the 'task name' (argv[0]) when the program starts.

2.5.2 Setup under RSTS/E RSX emulation mode

Under RSTS/E, the system manager must define the XCC and XAS CCL commands and the C: system-wide logical in a start control file such as the following (the account may be chosen to meet the system manager's needs):

        RUN $UTILTY
        ? ADD LOGICAL SY:[5,2]C
        ? CCL XAS-=C:AS.TSK;0
        ? CCL XCC-=C:CC.TSK;0
        ? CCL MCR-=C:MCR.*;30000
        ? EXIT

2.5.3 Setup under RSX-11M

As it is assembled, the CC compiler looks for #include files of the form '<filename>' on logical device 'C:'. This will not work on RSX-11M, so the distributed compiler build file does a 'GBLPAT' to the location labeled 'SYSINC' to change it to "LB:[1,1]". On an RSX-11M PLUS system, you should change this to your 'libuic' if necessary by editing MMAKCC.CMD.

Install CC and AS as MCR external commands '...XCC' and '...XAS', respectively. The CC compiler MUST be installed checkpointable in a mapped system to allow for task extension. If you have an unmapped system, or do not have the 'extend task' directive in your executive, install CC with an 'INC=20000' at least, more if you get compiler aborts.
2.5.4 Setup under RT11 and RSTS/E RT11 mode

Under RT11, setup consists of simply ASSIGNing a physical device to the logical device "C:". The compiler and assembler .SAV files, the SUPORT.OBJ module, and the library CLIB.OBJ should be placed on device 'C:'.

You can make the assignment of device 'C:' as part of the startup command file, e.g.:

        .ASSIGN RK0: C:

This compiler has been built and used under RT-11 V3B and V4. It has run on a PDP-11/34, a PDP-11/05 and on PDT150 systems.

Under RSTS/E, the system manager must execute a startup control file such as the following:

        RUN $UTILTY
        ? ADD LOGICAL SY:[5,2]C
        ? CCL AS-=C:AS.SAV;8192
        ? CCL CC-=C:CC.SAV;8220
        ? CCL MCR-=C:MCR.*;30000
        ? EXIT

2.6 Invoking compiled C programs

When your program begins to execute and the startup module sees that a command has been typed, a Unix C setup sequence is emulated, including I/O redirection and command argument processing. The startup module does not expand wild-card filenames, however.

NOTE

On RSX-11M, this feature cannot be used unless your program is installed as an MCR external command, i.e. with a task name of '...xxx', and activated by typing the "xxx". This requires that you be a privileged user.

On RT-11, if no command line has been passed, the module prompts "Argv: " and accepts a single line which is then parsed into command arguments. This can be disabled by defining the $$narg global symbol as described in the library documentation.

NOTE

On native RT-11, a command line passed via "RUN prog ..." which has more than one 'token' or 'word' in it gets parsed by the RT-11 monitor before it ever gets to the C program. See the documentation in the RT-11 manual on the 'RUN' command. It causes an "=" sign to get inserted, and the order of arguments is shuffled.
To get around this, either use "RUN prog" and answer the "Argv: " prompt with the command line, or enclose the command line in some delimiter plus a space, e.g.:

        RUN C:prog [ command line ]

which tacks an extra token on to the command line that looks like "]=[" for the case above. There is no problem on command lines which have one token.

If you include an argument of the form '>file', standard output will be written to the indicated file. If you include an argument of the form '>>file', standard output will be appended to the file (creating it if necessary). Append does not work on RT11-modes.

If you include an argument of the form ' parameter in the command definition as shown above.

o On RSTS/E, this will be the CCL name or the program name as passed to the MCR program.

o On RT11 (or on RSTS, by default, if no name can be found), this will be the string 'Argv: '.

For example:

        /*
         * Echo arguments
         */
        main(argc, argv)
        int argc;
        char *argv[];
        {
                register int i;

                printf("Program \"%s\" has %d parameters\n", argv[0], argc);
                for (i = 0; i < argc; i++)
                        printf("Argument %d = \"%s\"\n", i, argv[i]);
        }

The above program is executed as follows on VMS:

        $ ECHO abc "def ghi"
        Program "ECHO" has 3 parameters
        Argument 0 = "ECHO"
        Argument 1 = "ABC"
        Argument 2 = "def ghi"

Notice that unquoted arguments are converted to upper case by the operating system.

Under RSTS/E, a C program may be installed as a CCL command or the program may be started using the MCR CCL command which emulates a CCL invocation for C programs.

2.7 Predefined symbols

Before reading the program source file, the C compiler defines several symbols (which may then be tested with '#ifdef' statements):

        decus           This is the Decus compiler.
        nofpu           This version does not support floating-point.
        nomacarg        This version does not allow macros with arguments.
        pdp11           Generate code for the PDP-11.
        rsx             The RSX compiler (or)
        rt11            The RT11 compiler

2.8 Profiling

The profiler permits the accumulation of function call statistics during the execution of a program. If any of the files comprising a program were compiled with the profile option (and at least one of them has been called) then a call profile, listing the function name and the number of calls, will be written to file 'profil.out' when the program terminates. Also, if the program terminates because of a fatal error (such as an illegal memory reference), a register dump and call trace will be printed on the command terminal.

The run-time library contains several functions that can be called to dynamically print flow trace information. For more information, consult the C Runtime Library manual.

2.9 Diagnostics

There are two general classes of diagnostics: those that relate to compiler conditions, and those that relate to errors in the user's program.

The only compiler condition messages the user should normally see are those of the form "Cannot open .... file". These mean exactly what they say.

Other compiler condition messages are "Abort in phase x", "Abort loading phase x" and "Trap type x", where "x" is replaced by some small constant. These are most likely attempts to use floating-point operations. If not, you are the proud owner of a compiler bug. Report your find to a guru. Remember the register dump and save your source file and both temporary files. They are important. If you blunder into a missing code table the compiler aborts with an error message.

Errors in the user's programs are reported in English, tagged by the line number (which may be off by 1). Because of the nature of the language, errors sometimes snowball. If you are greeted by thousands of error messages, try fixing up the first few. You may be pleasantly surprised.
The following are common sources of 'thousands of errors':

o If there is a missing right brace within a function, all succeeding functions will miscompile. The error message will include a tag of the form "within function xxxxx", where "xxxxx" is the function with the missing brace.

o If there is a missing right parenthesis in an if or while statement which is followed by a left brace, the syntax analyser will 'lose' the brace, causing many messages:

        if ((foo = fopen("abc.def", "w") == NULL) {
        ...

o In general, if the error message is "illegal expression", that is (probably) the current line. If the message is "illegal statement", you should look at the previous statement.


CHAPTER 3

RUNTIME ENVIRONMENT

This description of the C runtime environment is sketchy. The best reference is compiler generated code, and any question regarding 'how does it ....' can usually be answered by compiling a suitably contrived program.

3.1 Program Sections

The C compiler uses 5 program sections. The '.PROG.' p-section is used for all code. The '.DATA.' p-section is used for all static data. The '.STRN.' p-section is used for the bodies of all literal strings. The '.PROF.' p-section is used to hold the names of functions for the profiler. The '.MWCN.' p-section is used to hold multi-word (long and floating-point) constants.

All code is 'pure'. However, the assembler is not able to generate all the varieties of .PSECTs. Thus, everything is read-write. This should be changed. Also, the compiler does not write a symbol table as such, making debugging a chore.

3.2 Register Usage

R5 is used as an environment frame pointer. It points to the highest address of the stack frame of the current function. In MACRO-11 programs, symbols C$PMTR and C$AUTO may be used to refer to the first parameter and first automatic variable, respectively.
Thus, when writing a MACRO subroutine, you should write:

        MOV C$PMTR+(R5), Dst

to access parameters (the first parameter_number is 0). (This cannot be done when using the AS assembler.) To access automatic variables, you should write:

        MOV C$AUTO-(R5), Dst

where the first variable_number is 1. (This cannot be done when using the AS assembler.)

Registers R2, R3 and R4 are used as register variables. The first register variable to be declared goes in R4, the second in R3 and the third in R2. Any register not used as a register variable can be used as a temporary. Registers R0 and R1 are always scratch registers.

3.2.1 Calling Sequence

The first instructions in a C function are a 'JSR R5,CSV$' and a subtract to claim stack space. The 'CSV$' routine points R5 at the new stack frame and pushes registers R4, R3, and R2 onto the stack. (Note that the character '$' in the CC/MACRO environment is represented by '~' in the AS environment.) R0, R1 and the floating point registers are NOT saved. This means that if a C function is called asynchronously (i.e. from an AST routine) the caller must arrange to save these registers or be prepared to face the music.

Functions return via a 'JMP CRET$'. The return value is in R0 (for ints, chars and pointers), R0-R1 (for longs, high part in R0) or AC0 (floats and doubles).

The caller passes control to a function by first pushing the arguments (from right to left) onto the stack, calling the function via a 'JSR PC,FUNCTION', and popping the arguments off of the stack when the function returns.

All arguments are passed as ints, longs (push low part, then push high part) or doubles. Characters are passed as integers; floats are passed as doubles.

3.3 Global Symbols containing RAD50 '$' and '.'
With this version of C, it is possible to generate and access global symbols which contain the Radix-50 '.' and '$'. The compiler allows identifiers to contain the Ascii '$', which becomes a Radix-50 '$' in the object code. The AS assembly code shows this character as a tilde (~). The underscore character (_) in a C program becomes a '.' in both the AS assembly language and in the object code. Thus, in RSX-11M, it is possible to say

        extern int $dsw;
        . . .
        printf("Directive status = %06o\n", $dsw);

which will print the current contents of the task's directive status word.

NOTE

Be careful about using global 'equates' in C. These are NOT address labels. For example, if you declare "extern int is_suc;", where IS.SUC is externally equated to 1, and then use is_suc in an expression, you will get the contents of location 1 (and probably an odd address trap!). It is possible (but very tacky) to get around this by prefixing the use of the equated symbol with the '&' operator, since it means 'take this literally, not what it points to'. Consider #define'ing the symbols in a C header file instead.

3.4 Virtual Addresses in C

When interacting with executives and MACRO-11 programs at the low level made possible by C, it is likely that virtual addresses will be manipulated and used as pointers. This is particularly true when using the memory management functions. Also, the C storage allocator functions return virtual addresses, not C pointers. It is important to make this distinction, owing to C's powerful address arithmetic capabilities. See The C Programming Language by Kernighan and Ritchie, sections 5.4 and 5.6.

It is a kluge to define a virtual address as an integer. Since virtual addresses on the PDP-11 are 'pointers to bytes (or characters)' it is wise to adopt the convention of defining them as character pointers.
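The reason for preferring character pointers can be seen from how C scales address arithmetic. The following is a minimal sketch in present-day ANSI C, not in the dialect accepted by this compiler; the helper names byte_advance, int_advance and byte_distance are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: advance a byte address by nbytes.
 * Arithmetic on a char pointer moves exactly one byte per unit. */
char *byte_advance(char *base, size_t nbytes)
{
    return base + nbytes;
}

/* Hypothetical helper: the same kind of step on an int pointer
 * moves sizeof(int) bytes per unit, not one byte. */
int *int_advance(int *base, size_t nelems)
{
    return base + nelems;
}

/* Distance in bytes between two addresses inside the same object. */
ptrdiff_t byte_distance(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}
```

Adding 2 to a character pointer yields an address two bytes higher, while adding 2 to an int pointer yields an address 2 * sizeof(int) bytes higher. A virtual address held in a character pointer therefore behaves the way a byte address is expected to.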
To make things crystal clear, one might declare "typedef char *ADDR;", making ADDR synonymous with 'character pointer'.

3.5 Profiler

When a program is compiled with the 'p' option, the standard save is replaced by a "JSR R5,PCSV$". Immediately following the call is a pointer to a zero word (for the counter) followed by the name of the function (in the '.PROF.' p-section as a null terminated string). The 'PCSV$' routine increments the zero word on every call:

                .psect  .prog.
        entry:  jsr     r5,pcsv$
                .word   prof
                .psect  .prof.
        prof:   .word   0               ; Incremented at each call
                .asciz  /entry/         ; Function name
                .even
                .psect  .prog
                ...

The printing of the profile is arranged by having 'PCSV$' stuff a global cell '$$PROF' with a pointer to the profile print routine. This routine (called automagically on exit) scans through core looking for "JSR R5,PCSV$" instructions, and prints the statistics to the file 'profil.out' via 'fprintf'.

The trace module has several other attributes:

o If the program fails because of an unexpected trap to the operating system (and the profile collection code was executed at least once), a register dump will be printed on the command terminal and the program will exit by calling error().

o If the function's execution would cause the stack pointer to go below 600 octal, the program will be aborted after printing an error message.

o It is possible to obtain a dynamic trace of the flow of a program by assigning the file descriptor of an open file to global variable '$$flow'. For example:

        #include <stdio.h>
        extern FILE *$$flow;

        main ()
        {
                $$flow = fopen("trace.out", "w");
                process();
        }

  Note that the program may execute

        $$flow = stdout;

  to write the trace to the command terminal. To turn off tracing, close $$flow and set $$flow = NULL.
o The caller() function may be used to obtain the name of a routine's caller:

        main ()
        {
                subr();
        }

        subr ()
        {
                printf("%s\n", caller());
        }

  When subr() is executed, it will print "main".

o The calltr() function may be used to print a trace of calls from main() to the function that called calltr():

        main ()
        {
                subr();
        }

        subr ()
        {
                calltr(stdout);
        }

  When subr() is executed, it will print:

        [ main subr ]

  on the standard output file. If some routine in the call trace was not compiled with profiling, the octal address of the routine's entry point is printed. If the routine gets confused (perhaps because the program is exiting due to a trap), it prints "".

o If the program exits by calling error() and the profile collection code was executed at least once, a call trace will be printed on the command terminal.

3.5.1 Example

A function max(a, b), which returns the maximum value of its two integer arguments, may be written as follows:

        max(arga, argb)
        int arga;
        int argb;
        {
                return((arga > argb) ? arga : argb);
        }

After compilation, the following .S code will be generated:

        max:    jsr     r5,csv$
                cmp     2(sp),4(sp)
                blt     .0
                mov     2(sp),r0
                br      .1
        .0:     mov     4(sp),r0
        .1:     jmp     cret$


CHAPTER 4

INCOMPATIBILITIES AND RESTRICTIONS

The language accepted by the compiler is the language described in the Unix Seventh Edition documentation (and Kernighan and Ritchie) with several exceptions.

The file 'C:CBUGS.DOC' contains a current list of bugs. These should be regarded as restrictions -- anything that was easy to fix has been fixed.

4.1 Restrictions

o The AS assembler recognizes several pre-defined variables. Consequently, the following may not be used by a C program: 'r0, r1, r2, r3, r4, r5, sp, and pc'.

o Initialization of automatic and local static variables is not supported.

o Enumerations are not supported.
o Bit fields do not work -- attempting to use bit fields will cause the compiler to abort with a 'missing code table entry' error.

o Symbols defined as global may not be redefined as local to a function.

o Variables may only be declared at function entrance. The latest C language specification allows variable declaration at any block entrance.

o Floating point is non-existent. If you attempt to compile a program that uses floating-point, the compiler will abort with a suitable message.

o The compiler does not support 'old-style' assigned binary operators. These will generally result in syntax errors. One exception (which started the whole mess) is "foo =- 6". This will be accepted by the compiler. Unfortunately, it will generate "foo = (-6)" when the program probably wanted "foo = foo - 6". You have been warned.

o The compiler allocates storage to character variables as if they were integers (except if the character variable is an array or part of a structure). If single-byte allocation is necessary, the program should proceed as follows:

        char char_a[1];         /* Declare one-byte character */
        #define A char_a[0]     /* Name first byte of char_a[] */

o The include statement has two modes:

        #include "filename"     Includes the fully-qualified file.

        #include <filename>     Includes the library file, equivalent to:
                                #include "C:filename"
                                (LB:[1,1] on RSX)

o Macros (#define statement with arguments) do not exist.

o As noted in the library documentation, the following built-in function may be overridden by the C-program:

        wrapup()        Called when the program exits.

4.2 Incompatibilities

There are several incompatibilities between the current DECUS compiler and earlier versions which had been distributed by various DECUS special-interest groups. Those known (and the implications) are:

o The RSX compiler's subroutine calling sequence has been changed to match the RT11 compiler's (and Unix's).
  This means that all user-written assembly-language code must be modified. The calling sequence appears to be compatible with the Unix and Whitesmith compilers, although library names are different. Also, the Whitesmith compiler has several optimizations in its subroutine calling sequence that are not present in this compiler.

o The underscore character now generates a RAD50 dot, instead of a dollar-sign. The compiler allows dollar-signs in local and global variables. Thus, C programs can now access all PDP-11 global symbols. Because of the change of the meaning of underscore, all user-written assembly-language code must be modified.

o I/O library conventions now generally follow the Unix V7 definitions. There are several implications. In general, however, all C-language I/O calls should be examined. The major problems are described below.

o fopen("filename", "openmode") follows the RSX-library and Unix V7. This is incompatible with Unix V6 and the old RT11-library call.

o fgets(buffer, sizeof buffer, fd) requires the second buffer size parameter, and does not remove the trailing newline. This follows Unix V7 I/O conventions. fgetss() is a new function, identical to fgets() except that it removes the trailing newline. fgetss() is compatible with the fgets() function in previous versions of the Decus compiler.

o fputs(buffer, fd) does not append a newline to the record. This follows Unix V7 I/O conventions. fputss() is a new function, identical to fputs() except that it appends a trailing newline.

o The "execute non-local goto" functions have been renamed. Unix V6 reset() and setexit() (Unix V7 longjmp() and setjmp()) are called reset() and unwind() in this release. Two new functions, envsave() and envreset() are also present for this purpose.

o The ctime() function (return time of day in Ascii) does not return a trailing newline. To get the time of day, the program may execute ctime(0).
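The difference between fgets() and fgetss() noted in the list above can be sketched in present-day ANSI C. This is an illustration only, not the DECUS library source: read_line_stripped is a hypothetical name, and it is built on the standard fgets() under the assumption that fgetss() differs solely in dropping the trailing newline, as the text states.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the fgetss() behaviour described above:
 * read one line with the standard fgets(), then drop the trailing
 * newline if one was stored. Returns buf, or NULL at end of file
 * or on error. */
char *read_line_stripped(char *buf, int size, FILE *fp)
{
    size_t len;

    if (fgets(buf, size, fp) == NULL)
        return NULL;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';
    return buf;
}
```

A program converted from an earlier Decus release that relied on fgets() removing the newline could call a routine of this shape (or fgetss() itself) instead of editing every place that uses the returned line.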
4.2.1 Conversion from Unix

It is expected that many programs will be converted to the Decus compiler from Unix. While trivial programs will require no work whatsoever, most programs will require hand editing. Note the following:

o Floating point is non-existent. Many floating-point variables can be recoded as long integers (large counters, for example). Anything else cannot be converted at all.

o Unix V6 assigned binary operators must be converted to the new format. Most of these will be caught by the syntax analyser. Note, however, that "foo =- 6" will parse, generating incorrect code.

o The Decus compiler has a 500 word expression stack. This means that many complex expressions (especially those with embedded conditional statements) will cause the compilation to abort. This requires rewriting.

o The Decus compiler lacks macros with arguments. Many of these can be rewritten as function calls. If the program intentionally makes use of the fact that macros are expanded in-line, hand-editing will be needed. Note also that only one level of indirect (#include) file is supported.

o Unix V6 I/O is not supported. Thus, any program using read(), write(), open(), or creat() will require extensive modification. Note also that fopen() operates quite differently in the standard I/O package than it did in Unix V6. This requires rethinking but is fairly straight-forward. Also, note that only a limited file random-access capability is present.

o Very large programs (which depend on Unix's ability to generate programs with separate instruction and data space) must be redone using the linker (task-builder) overlay capability. Non-trivial.

o Programs that use large amounts of local storage (allocated on function entrance) must be linked with enough stack space.
   When testing a program, it is highly recommended that the program
   be compiled with profiling, as this enables a stack overflow
   check on function entrance.

In general, the programmer should be alert to such minor
incompatibilities as do exist.


APPENDIX A

FILE CBUGS.DOC (17-Sep-80)

The following is a reproduction of the CBUGS.DOC file distributed
with the DECUS C system, as of 22-Sep-80:

** 03-May-80

The construction

        return((c < 0) -1 : 0);

(with a missing "?") aborts in phase 2.  It should yield a syntax
error message.

** 07-May-80

The compiler outputs a spurious error message if you follow a
declaration by an "extern" declaration:

        int foo;
        main() {
                ...
        }
        extern int foo;

The extern definition is flagged as a "redeclaration".

** 13-May-80

The compiler doesn't always handle typedefs correctly.  Note the
following:

        typedef struct foo *FOOPTR;
        struct foo { ... };
        FOOPTR foofun() ...

Foofun() gives a syntax error "declaration semantically forbidden".
However,

        struct foo *foofun()

works correctly.

** 19-May-80

PDP-11 register definitions ("r0, r1, ... r5, sp, and pc") are
generated by the compiler to refer to the hardware registers.  Also,
these are predefined by the AS assembler.  Thus, you cannot give a
function any of these names (sp(), for example).

** 09-Jun-80

Previous versions of RSX CC wrote intermediate files and the compiler
output (.s) file on the same disk/directory as the input file.  This
has been changed so as to write all output files onto the user's
current directory.  This is compatible with RT11 CC and general PDP11
practice.  For example, assuming RSTS/E:

        Command                 Old                 New

        xcc [100,100]foo        [100,100]foo.s      sy:foo.s

Note that RT11 CC does a "normal" CSI scan, thus allowing placement
of all files.

** 19-Jun-80

Certain constructions don't get registers set up properly.
For example, the following program crashes the compiler:

        long
        atol(s)
        char s[];
        {
                long n;

                n = 10 * n + (s[0] - '0');
        }

As a temporary fix for this sort of problem, you can break the code
into smaller units:

        long
        atol(s)
        char s[];
        {
                long n;
                register int i;

                i = s[0] - '0';
                n = 10 * n + i;
        }

** 23-Jul-80

The AS assembler does not always process relative branches (br .+4)
correctly.  They should not be used.  (Code generated by CC no longer
contains relative branches.)

** 24-Jul-80

Integer to long conversion is not always done the way one might
expect.  For example:

        longval = ((long) intvalue) ...

does not convert the integer to long (with sign extension), but
rather uses a "garbage" high-order word.  Also,

        longval = intval * intval;

converts the RESULT of the computation to a long, as if the program
executed:

        inttemp = intval * intval;
        longval = inttemp;

Moral: the correct way to proceed is:

        longtemp = intval;
        longval = longtemp ...

Sorry.  Note, however, that a constant with an L suffix is a long
constant.  Thus,

        longval = intval * 123L;

works properly.

** 14-Aug-80

Note the following errors in the C parser:

        char foo[][7] { ... };

        subroutine() {
                extern char foo[][7];
                ...

The extern declaration is rejected.  Also,

        if (...)
                do {
                        ...
                } while (...);
        else ...

is rejected with the message "illegal else".  Rewrite as:

        if (...) {
                do {
                        ...
                } while (...);
        }
        else ...

Sorry.

** 15-Aug-80

The C compiler really and truly does not support the old assigned
binary operators (=+, =-, etc.).  In fact, the statement "foo =- 6"
is equivalent to "foo = (-6)"; it IS NOT equivalent to
"foo = foo - 6".
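The "=-" pitfall above is easy to demonstrate, since present-day C
compilers parse the statement the same way.  The following is a
minimal sketch; the function names old_spelling() and new_spelling()
are hypothetical and exist only for illustration.

```c
/* V6 programmers wrote "foo =- 6" to mean "subtract 6 from foo". */
int old_spelling(int foo)
{
    foo =- 6;       /* parsed as foo = (-6): the old value is discarded */
    return foo;
}

/* The current spelling of the assigned operator. */
int new_spelling(int foo)
{
    foo -= 6;       /* foo = foo - 6 */
    return foo;
}
```

Called with an argument of 10, old_spelling() returns -6 while
new_spelling() returns 4.  When converting V6 sources, it is therefore
worth searching for "=-", "=+", "=*" and "=&" explicitly rather than
relying on the syntax analyser to flag them.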
** 18-Aug-80

The C compiler may reject structure definitions within the body of a
function:

        foo() {
                struct {
                        int *bar;
                };
        }

However, if the structure definition is moved outside the function
body, it will compile correctly:

        struct {
                int *bar;
        };

        foo() {
                ...
        }

** 12-Sep-80

The RSX file services emulator library under RSTS/E contains a global
symbol "EOF".  Consequently, C programs running under this library
may not define a global by this name.  The following will not
task-build correctly on RSTS/E, RSX-11 mode:

        int eof;

        main() {
                ...
        }

Note that this restriction is due to the global's presence in the
operating-system library, not the C run-time library.

** 15-Sep-80

Fwild/fnext will not properly process versions ;0 and ;-1 on native
RSX systems that support FILES-11 disk directory structures.  The
algorithm works correctly on ODS2 structures (and thus on VMS
compatibility mode).  The algorithm will work on native RSX if the
directory is sorted (by a program such as SRD) in order of decreasing
version numbers.

** 15-Sep-80

The C compiler is restrictive as to the ordering of definitions.  For
example, given the following structure definition:

        struct stack {
                int maxindex;
                int currentindex;
                int *vector;
        };

The compiler rejects the following sequence:

        struct stack datum { DATUMMAX, 0, datum };
        int datum[DATUMMAX];

However, it accepts the following sequence:

        int datum[DATUMMAX];
        struct stack datum { DATUMMAX, 0, datum };
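The accepted ordering (storage first, then the structure whose
initializer refers to it) can be written out as a complete fragment.
This sketch uses present-day initializer syntax (with "=") rather than
the form shown above, and several details are assumptions made for
illustration: the names datum_store and datum_stack are hypothetical,
chosen so that the array and the structure have distinct names, and
the value 16 for DATUMMAX is arbitrary.

```c
#define DATUMMAX 16     /* arbitrary size, assumed for the example */

struct stack {
    int maxindex;
    int currentindex;
    int *vector;
};

/* The storage is declared first ... */
int datum_store[DATUMMAX];

/* ... so the initializer below refers to an already-declared name. */
struct stack datum_stack = { DATUMMAX, 0, datum_store };
```

Declaring every object before the first initializer that mentions it
avoids the restriction without changing the meaning of the program.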