This document describes DECSYSTEM-20 Pascal. This Pascal system is the result of cooperation among a number of different people. It was originally written at the University of Hamburg by a group of people under the supervision of Prof. H.-H. Nagel. This version was developed from Prof. Nagel's by Charles Hedrick, and is maintained by him. Lee Cooprider and others at the University of Southern California have been particularly helpful in supplying improvements, largely to the debugger. A number of compiler bug fixes were supplied by Andrew Hisgen at Carnegie-Mellon University.

Charles Hedrick originally intended to produce a system that gave complete access to the facilities of the operating system. To do this, a number of procedures were added, and optional arguments were added to several existing procedures. These additions give you access to the full power of the DECSYSTEM-20's input output system, as well as to other facilities such as interrupt handling. While making these additions, Dr. Hedrick ignored a number of shortcomings in the design of the original compiler.

More recently, the goal has shifted to producing a complete implementation of the language, with error handling and debugging appropriate for student use. The standard in this effort has been the PASCAL Revised Report. No attempt has been made to implement the changes proposed for the ISO standard.

As a result of these two goals, this compiler is now appropriate for both system programming and instructional use. However it is still not an optimizing compiler, and should not be used for applications where high-quality code is important.

This system is now intended to be a complete implementation of the language. The following are the only serious limitations. A complete list will be found in an appendix.

- Procedures can be passed as parameters to another procedure. When calling a procedure that has been passed in this way, you can supply no more than 5 parameters.
- Sets of characters are not fully implemented. Lower case characters are treated as equivalent to the corresponding upper case letter when in a set. All control characters except tab are treated as equivalent in a set.

This manual is intended as a complete reference manual for this implementation. As such it contains far more detail on extensions than many users will need. There is a somewhat briefer manual, which is more suitable for the average user. Both manuals describe only features that differ from those documented in the Revised Report, so you should look at the Revised Report first.

1. How to use PASCAL-20

1.1 How to use the normal compiler

The usual version of the compiler, PASCAL, follows the standard DECsystem-10 conventions for compilers, and can thus be invoked by COMPIL-class commands. (If your EXEC does not know about PASCAL, see the next section, where a special version of the compiler, called PAS, is described.) To compile and execute a PASCAL program TEST, you would issue the command

     EXECUTE TEST.PAS

The usual COMPIL switches, such as /NOBIN, /LIST, and /CREF can be used. For other commands, such as COMPIL, DEBUG, and LOAD, see your EXEC documentation.

If your program begins with a PROGRAM statement, execution will begin by asking you for file specs for each of the files mentioned in the program statement. You should type a standard Tops-20 file spec, terminated with . Recognition is used. If the file already exists, the most recent generation is used by default. (It is possible to change this default so that a new generation is created. See section 2.2.) If you type only a carriage return, a default file spec will be used. For INPUT and OUTPUT this is "TTY:", normally your terminal. For other files it is a disk file with the same name as the Pascal file name.

If you assign the file INPUT to a terminal, you will normally be prompted for the first line of input by the Pascal I/O system, [INPUT, end with ^Z: ].
Because of oddities of the Pascal language, this initial read is done before your program has started. Hence you cannot issue a prompt first. This can be avoided by specifying INPUT to be interactive (see below).

Note that the effect of listing a file in the PROGRAM statement depends upon whether it is a predeclared file -- INPUT or OUTPUT -- or a file declared by the user. For a user-declared file, listing it in the PROGRAM statement simply provides an easy way to get a file specification for it at runtime. It does not open the file. That is, you must still do RESET or REWRITE on it, and you must still declare the file identifier in the VAR section of the program. However, for the predeclared files INPUT and OUTPUT, listing them in the PROGRAM statement also causes the system to open them (RESET for INPUT, REWRITE for OUTPUT).

If you choose not to use COMPIL-class commands, you should say

     @PASCAL *,=// ...

Anything other than the source file may be left out. The defaults are:

     relfile:     not produced if missing; if no extension: .REL
     listfile:    not produced if missing; if no extension: .LST
     sourcefile:  if no extension: .PAS

Since the EXEC expects to be dealing with Tops-10 compilers, we have to supply a Tops-10 scanner. The usual limitations apply:

- File names are limited to 6 characters for the file name and 3 for the file type
- No version numbers may be used
- If the directory must be specified, you must use a PPN, not a directory name.
- No recognition is done on file names or switches

At the moment, if you want to supply compiler switches you must call the compiler and use the syntax just given. In release 4 the EXEC will have a syntax to allow switches to be passed to the compiler in the COMPILE, EXECUTE, LOAD, and DEBUG commands.

Whether you run the compiler explicitly or pass switches through the release 4 EXEC, the following compiler switches are allowed:

/ARITHCHECK - Turns on checking for arithmetic errors, i.e. divide by zero, overflow, and underflow.
If this switch is not specified, the setting of /CHECK is used as its default.

/CHECK - generates code to perform runtime checks for indices and assignments to scalar and subrange variables. Pointer accesses will also be checked for NIL or zero pointers. (Usually a zero pointer is the result of a pointer variable not being initialized.) Divide by zero, overflow, and underflow are also caught. All of these cases cause an error message to be printed and transfer to PASDDT1 or DDT if they are loaded. Note that .JBOPC is set up for DDT. However the program cannot necessarily be continued, because the AC's may be those of the error trapper. Also, in the case of an arithmetic error, you are at interrupt level in DDT.

/CREF - generates information so that CREF can produce a crossreference listing. Changes default extension for the listing to .CRF.

/DEBUG - generate information for the DEBUG package. This is normally on except for production programs. We strongly encourage people to turn this off (probably by putting the directive (*$D-*) in their program) when they know they have finished using PASDDT. The debug information may double the size of your program.

/HEAP:nnn - sets the first address to be used for the heap. This is the storage used by NEW. It will begin at the specified address and go down. The only known use for this is if you intend to load the high segment at other than 400000.

/MAIN - a main program part is present (see Section 4.4)

/OBJECTLIST - list the generated code in symbolic form

/STACK:nnn - sets the first location to be used for the stack. This should be above the high segment (if any). The only known use is if you intend to do a GET to a larger segment and want to be sure the stack doesn't get in the way.

/VERSION:vvv - must be given on the output side. This version number will be put in .JBVER unless overridden by a later directive. vvv is the usual DEC-10 version number, e.g. 3B(22)-4.
/ZERO - Causes code to be compiled in every procedure and function prolog to initialize all of its local variables to 0. Also, NEW will initialize all records it generates to 0. This is useful mainly for programs with complicated pointer manipulations, since it guarantees that uninitialized pointers will be 0, which will be caught by the /CHECK code. Note that /ZERO and /CHECK can be set independently, however, and that /ZERO applies to all local variables, not just pointers. /ZERO will not cause global variables to be reinitialized, although they always start out as zero unless an initprocedure is used to give them another value.

To get the opposite effect of a listed switch, type /NO. The default switch settings are /CHECK /DEBUG /MAIN /NOOBJECTLIST /NOZERO.

For /STACK and /HEAP the arguments to the PASCAL compiler can be nnP, nnK, nnnnnn, or #nnnnnn. This specifies a core address in pages, K, decimal, or octal. The default values are 0, which causes 400000B to be used for the heap and the stack to be put immediately above the high segment. Values will be rounded up to the nearest page boundary. For the PAS compiler, described below, the arguments to /STACK and /HEAP are assumed to be octal numbers, with the same interpretation.

1.2 PAS: A Special Compiler for Unmodified EXEC's

The instructions above assume that the EXEC has been modified to know about the PASCAL compiler. If it has not, then you cannot use the EXECUTE, COMPILE, LOAD, or DEBUG commands to compile a PASCAL program. To simplify things for users at sites with an unmodified EXEC, we supply a second compiler, usually called PAS. This compiler has the most important features of the EXEC built into it. If your EXEC knows about PASCAL, you will have no need for PAS, and you should ignore this section.

PAS differs from a normal compiler in that it rescans the command line. Thus you can use PAS as if it were a command, listing file names and switches on the line where you call it.
Because of design limitations in the EXEC, recognition is not available for the command line. So you must type file names out, except that the file type .PAS will be assumed if you don't type one. Of course switches may be abbreviated.

The default when you use PAS is that it will execute the program. So to execute a PASCAL program TEST.PAS, you would use the command

     PAS TEST

Only one source file can be mentioned in the command. TEST.PAS would be compiled only if there has been a change since the last time it was compiled. Then LINK would be called to load and start the program.

PAS has switches to select a few options other than simple execution.

     PAS TEST/DEBUG   - debug with PASDDT
     PAS TEST/LOAD    - compile and load; don't start
     PAS TEST/NOLOAD  - compile only, don't load

Note that the switch /DEBUG has a different interpretation in this context than was documented above when compiler switches were described. It causes the program to be compiled (if necessary), loaded with PASDDT, and PASDDT started.

The following switches are also available in PAS:

     /COMPIL  - compile even if source is not changed
     /LIST    - produce a listing file
     /CREF    - produce a CREF listing file
     /OBJECT  - produce listing file with object code
     /NOBIN   - do not produce a .REL file

All other compiler switches described in the previous section (including /NODEBUG) are also available and have the effect documented.

If no arguments are typed on the line with PAS (i.e. if you just type the command "PAS"), you will get a prompt PASCAL>. PAS will then expect you to type the name of a source file to compile. Only a compilation will be done, i.e. no automatic execution. Of the switches mentioned in this section, only /LIST, /CREF, /OBJECT, and /NOBIN will be meaningful. Of course the normal compiler switches documented in the previous section will still be available.

1.3 Core Allocation

PASCAL has dynamic core allocation. This means that memory will automatically expand if you do NEW a lot, or use a large stack.
Thus if you get a message informing you that you have run out of core, it means that you used all of your virtual memory space. In such a case, you should reconsider your data structures or algorithm.

Programs that do a lot of dynamic memory allocation should consider returning space used by structures they are finished with. Note that PASCAL makes no attempt to garbage collect unused structures. This means that the programmer must know when he is finished with a particular record instance and DISPOSE it. DISPOSE is a standard procedure, described in some editions of the Revised Report. It takes exactly the same arguments as NEW. However it returns the space used by the record pointed to. This space can then be reused by later NEW's.

It is very important that any extra arguments you supplied to NEW in generating the record be supplied in the same way to DISPOSE. (These arguments are used only for variant records, to allow allocation of space for a particular variant.) If you do not use the same parameters, you will get the error message: "DISPOSE called with clobbered or already-disposed object". In addition to checking validity of the disposed object in this way, the runtimes also check for disposing NIL or 0, and give an appropriate error message.

If your program uses memory in a strictly hierarchical fashion, you may also find it possible to use the procedures MARK and RELEASE to deallocate memory (see section 3.7). RELEASEing an entire block of memory is more efficient than DISPOSEing of records one by one, though this efficiency is balanced by the fact that MARK and RELEASE are not part of official Pascal (though they are present in most implementations).

Note that you get a completely different version of NEW when you use DISPOSE and when you do not. The system handles loading the right version of NEW automatically. The version used with DISPOSE is not compatible with MARK and RELEASE.
It is also not compatible with use of the /HEAP switch (or $H directive) to start the heap at addresses above 377777 octal.

If you do interrupt handling, note that DISPOSE, and the version of NEW used when DISPOSE is used, are not reentrant. Thus you should treat sections of code using NEW and DISPOSE as critical sections. (The easiest thing is to call ENTERCRIT and LEAVECRIT around all calls to NEW and DISPOSE.) This is only needed if you are using DISPOSE, and if you do either NEW or DISPOSE at interrupt level. (That is why I have not put a built-in critical section around NEW and DISPOSE.)

1.4 How to Write a Program (Lexical Issues)

PASCAL programs can be written using the full ASCII character set, including lower case. Of course some characters (e.g. control characters) will be illegal except in character or string constants. Lines are ended by carriage-return/linefeed, form feed, or altmode (escape). Lower case letters are mapped into the equivalent upper case letters before being analyzed by the compiler, except in string or character constants. However, lower case letters will always appear in any listings exactly as read in.

Now we shall describe language elements which use special characters.

Comments are enclosed in { and }, (* and *), /* and */, or % and \. For example

     {This is an official comment}
     (*This is a comment*)
     %So is this\
     (*And so \ is this *)

The switches mentioned above as appearing in the compiler command line may also be set in the body of the program by directives. Such directives take precedence over any setting typed in the command string. These directives are comments which start with a $ sign and have the form

     (*$C+*)   or   %$C+\

Each switch has a corresponding letter (e.g. C represents /CHECK). A + after the letter indicates that the corresponding switch should be turned on, a - that it should be turned off.
More than one switch setting can be given, separating them with a comma:

     (*$T+,M-*)

The letters used in the directives correspond to the switches in the following way:

     A   ARITHCHECK
     C   CHECK
     D   DEBUG
     H   HEAP
     M   MAIN
     L   OBJECTLIST
     S   STACK
     V   VERSION
     Z   ZERO

The form for H and S is (*$H:400000B*), etc. The form for V is (*$V:2200000000B*), etc., i.e. the version number in octal.

Note that setting or clearing C also sets or clears A, so order matters. To clear C, but leave A on, you should do something like {$C-,A+}. This is consistent with the overall approach wherein the default value of ARITHCHECK is the same as CHECK.

Identifiers may be written using the underline character to improve readability, e.g.:

     NEW__NAME

Strings are character sequences enclosed in single quotes, e.g.:

     'This is a string'

If a quote is to appear in the string it must be repeated, e.g.:

     'Isn''t PASCAL fun?'

Note that mapping of lower case to upper case is not done inside strings.

An integer is represented in octal form if it consists of octal digits followed by B. An integer is represented in hexadecimal form if it consists of a " followed by hexadecimal digits. The following representations have the same value:

     63   77B   "3F

Several PASCAL operators have an alternate representation. The alternate form is provided for compatibility with older versions of PASCAL. The form of the operator shown in the left column should be used in new programs.

     operator   alternate form   explanation
     >=         "                greater or equal
     <=         @                less or equal
     AND        &                logical and
     OR         !                logical or
     NOT        $                logical negation
     <>         #                not equal
     +          OR,!             set union
     *          AND,&            set intersection

2. Input/Output

Input/Output is done with the standard procedures READ, READLN, WRITE, and WRITELN as described in the Revised Report on PASCAL [1,2].

2.1 Standard Files

In addition to the standard files INPUT and OUTPUT the standard file TTY is available in Tops-20 PASCAL. This file is used to communicate with the terminal.
The standard files can be directly used to read or write without having to use the standard procedures RESET or REWRITE. Note that these files are logically declared in a block global to all of your code. Specifically, if you use external procedures, those procedures may also refer to INPUT, OUTPUT, and TTY, and the same files will be used as in the main program.

As described in the Revised Report, the files INPUT and OUTPUT are opened for you automatically if you mention them in your PROGRAM statement. The file TTY does not need to be opened, since it is "hardwired" to the terminal. (Indeed mentioning TTY in the program statement is completely useless. Doing RESET or REWRITE on TTY is also almost completely useless, except that RESET can be used to establish lower to upper case conversion or to let you see end of line characters. However any file specification given in RESET will be ignored.)

2.2 File Declaration

Files that you declare follow the normal scope rules. That is, they are local to the block in which they are declared. This means that a file F declared in the main program is a different file than a file F declared in a file of external procedures, or in a different block. To use the same file in an external procedure, you should pass it as a parameter to the procedure. (It must be passed by reference, i.e. declared with VAR in the procedure header.)

You have two opportunities to specify what external file name you want associated with a Pascal file variable (e.g. that you want INPUT to refer to "TTY:"). One is by listing the file variable in the PROGRAM statement. This has been described above. The other is by supplying a file name as a string when you use RESET or REWRITE. If you do not supply a file name in one of these ways, the file is considered "internal". That is, Pascal will choose a filename for you, and will do its best to see to it that you never see the file.
When you exit from the block in which the file variable was declared, Pascal will delete the file. Such files are useful for temporary working storage, but obviously should not be used for the major input and output of the program.

The syntax of the PROGRAM statement has been extended to allow you to declare some extra attributes of the file. These allow the initial file name dialog to be made slightly more intelligent than otherwise. The attributes are specified by using a colon after the file name in the list. For example

     PROGRAM TEST(INPUT:-*/,OUTPUT:+);

The character - implies that this is an input file. When the user is prompted for the file name, an error message will be typed if it does not exist, and he will be asked to try again. The character + implies that this is an output file. This means that if the file exists and the user defaults the version number, a new version will be used. The character * allows wildcards in the file specification. If this extra syntax is not used, no assumption is made about whether the file is input or output, and the most recent generation is used for any existing file.

The / is effective only for the file INPUT. It specifies that the compiler-generated RESET should be done in interactive mode, i.e. that the initial implicit GET should not be done. This is useful for files that may be associated with terminals. It eliminates the initial [INPUT, end with ^Z: ] that would otherwise be generated by the Pascal I/O system.

2.3 RESET and REWRITE (simple form)

Except for the standard files, a file must be "opened" with the standard procedure RESET when it is to be used for reading. It must be "opened" with the standard procedure REWRITE when it is to be used for writing. RESET and REWRITE may have up to 6 parameters in Tops-20 PASCAL. However, most users will need at most 2 of them, so the others are deferred until section 3.8.

     RESET (,)

Only the first parameter is required.
If is specified, it must be of type PACKED ARRAY of CHAR. Any length is acceptable, and string constants may also be used. The parameter is expected to be the usual Tops-20 file spec. A GTJFN is done on it. Simple programs need not use this parameter, since they will get the file name via the starting dialog.

If you omit this parameter, the jfn currently associated with this file will be used again. Usually this means that the same file (and version) used before will be used. The opening dialog does a GTJFN, so in practice you usually end up with the file specified in that dialog. If the file was not listed in the PROGRAM statement, and you do not specify a file name some time when you open the file, it will be considered "internal", as described above.

To omit the file spec parameter when further parameters are specified, use a null string, i.e. ''.

In the following example REWRITE is used to give the file OUTPUT the actual file name TEST.LST, with protection 775252.

Example:

     REWRITE(OUTPUT,'TEST.LST;P775252')

Note that RESET and REWRITE can fail. The most common cause is something wrong with your file spec., but various other problems with the program or hardware can also cause failure. Unless you have specified user error handling (see below), you will get the official error message returned by the monitor.

2.4 Formatted Output

Parameters of the standard procedure WRITE (and WRITELN) may be followed by a "format specification". A parameter with format has one of the following forms:

     X : E1
     X : E1 : E2
     X : E1 : O
     X : E1 : H

E1 is called the field width. It must be an expression of type INTEGER yielding a non-negative value. If no format is given then the default value for E1 is, by type:

     INTEGER   12
     BOOLEAN    6
     CHAR       1
     REAL      16
     STRING    length of string

Blanks precede the value to be printed if the field width is larger than necessary to print the value.
Depending on the type involved, the following is printed if the field width is smaller than necessary to print the value:

     INTEGER (normal)   field width increased to fit
     INTEGER (octal)    least significant digits
     INTEGER (hex)      least significant digits
     REAL               field width increased to fit
     BOOLEAN            field width increased to fit
     STRING             leftmost characters

A maximum of 7 significant digits will be printed for real numbers. Rounding is done at the seventh digit (or the rightmost digit, if the format does not allow a full seven digits to be displayed). Because of the automatic expansion of formats for normal integers and reals, a field width of zero is a convenient way to get a free format output.

The minimal field width for values of type REAL is 9. The representation used for a field width of 9 is b-d.dE+dd, where b is a blank, - a minus sign or blank, d a digit, and + a plus or minus sign. As the field width is increased, more digits are used after the period, until a maximum of 6 such digits is used. After that point, any increased field width is used for leading blanks.

Example:

     WRITELN('STR':4, 'STR', 'STR':2, -12.0:10);
     WRITELN(15:9, TRUE, FALSE:4, 'X':3);

The following character sequence will be printed (colons represent blanks):

     :STRSTRST -1.20E+01
     :::::::15::truefalse::X

(Note that the field width for FALSE has been expanded in order to fit in the output.)

A value of type REAL can be printed as a fixed point number if the format with expression E2 is used. E2 must be of type INTEGER and yield a non-negative value. It specifies the number of digits following the decimal point. Exactly E2 digits will always be printed after the point. The minimal field width for this format is E2 + D + S + 2, where D represents the number of digits in front of the decimal place, and S is 1 if the number is negative and 0 otherwise. The extra 2 places are for the decimal point and a leading blank. There is always at least one leading blank, as required by the Revised Report.
Extra field width will be used for leading blanks.

Example:

     WRITELN(1.23:5:2, 1.23:4:1, 1.23:6:0);
     WRITELN(1.23:4:3, 123456123456:0:0);

The following character sequence will be printed (colons represent blanks):

     :1.23:1.2::::1.
     :1.230:123456100000.

The :1.230 is a result of automatic format expansion, since the specified 4 spaces were not enough. The 123456100000 shows that numbers will be rounded after 7 significant digits.

A value of type INTEGER can be printed in octal representation if the format with letter O is used. The octal representation consists of 12 digits. If the field width is smaller than 12, the rightmost digits are used to fill the field width. If the field width is larger than 12, the appropriate number of blanks precede the digits.

Example:

     WRITE(12345B:2:O, 12345B:6:O, 12345B:15:O);

The following character sequence will be printed (colons represent blanks):

     45012345:::000000012345

A value of type INTEGER can also be printed in hexadecimal representation if the format with letter H is used. The hexadecimal representation consists of 9 digits. Using the format with letter H, the following character sequence will be printed for the example above (colons indicate blanks):

     E50014E5::::::0000014E5

2.5 Reading characters

In official Pascal one cannot use READ or READLN to read into arrays of CHAR. Thus one sees many programs full of loops reading characters into arrays of CHAR, cluttering up essentially simple algorithms. I have implemented READ into arrays and packed arrays of CHAR, with capabilities similar to SAIL's very fine string input routines. An example of the full syntax is

     read(input,array1:howmany:[' ',':'])

This will read characters into the array array1 until one of three things happens:

- One of the characters mentioned in the "break set" (in this case blank or colon) is read. This character (the "break character") is not put into the array.
  You can find it by looking at INPUT^, since this always contains the next character that will be read by READ. Howmany (which can be any integer variable) is set to the number of characters actually put into the array.

- End of line is reached in the input. Again, howmany is set to the number of characters put into the array. You can test for this outcome by looking at EOLN(INPUT).

- The array is filled. In this case, INPUT^ is the character that would have overflowed the array. Howmany is set to one more than the size of the array, in order to allow you to detect this case uniquely.

If filling of the array is terminated by a break character or end of line, the rest of the array is cleared to blanks.

There is some problem caused by the fact that the implementation used for sets of characters does not allow all 128 ASCII character codes. To avoid this problem, lower case characters in the input are treated as break characters if the corresponding upper case character is in the break set. And all control characters are treated as break characters if any control character is specified as a member of the break set. (Tab is an exception - it is treated as a separate character.) Note that these limitations are actually limitations in set implementation. They have nothing specific to do with I/O. For example, if a lower case character is mentioned as a member of a set, its upper case equivalent is actually put in. Thus if you use ['a'] or ['A'] as a break set, you get exactly the same results: Both upper and lower case A are treated as break characters.

The break set can be omitted, in which case input breaks only on end of line or when the array fills up. The integer variable can also be omitted, in which case the count is not given to the user. Thus the actual syntax permitted is

     read([:[:]])

The user is cautioned not to confuse this syntax with the field width specification for output: READ(X:I) does not specify a field width of I.
Rather I is set after the input is done to tell how many characters were actually read.

A number of users have been confused as to how to read from the terminal. The following would seem to be appropriate:

     write(tty,'Please enter I and J as integers:');
     readln(tty,i,j);

However if you do all of your input that way, you will discover that in each case the system will go into terminal input wait trying to read the data BEFORE the question is printed. The reason has to do with the exact definition of readln: Readln reads characters until it comes to the first character AFTER the next end of line. When used with a terminal, this means that readln throws away anything else on the current line and waits for you to type the next one. Readln(tty,i,j) is equivalent to read(tty,i,j); readln(tty). So it reads the data and then asks for the next line, before the next question can be asked. The correct sequence of statements is

     write(tty,'Please enter I and J as integers:');
     readln(tty);
     read(tty,i,j);

Note that the first time this sequence is used in your program, the readln will pass the null that is automatically inserted before your first real character (see 2.6). For files other than TTY, you will have to declare them as interactive in order to get the same effect.

2.6 The Standard Files

There are three files which may be initialized by PASCAL for the user automatically. These are INPUT, OUTPUT, and TTY. If you list it in the program statement, INPUT is initialized by an implicit RESET. If you list it, OUTPUT is initialized by an implicit REWRITE. TTY is always initialized on the user's terminal. For most purposes one may assume that TTY is both RESET and REWRITTEN, i.e. that it can be used for both read and write operations. As in standard PASCAL, the default file for those standard procedures that read is INPUT, and for those that write, OUTPUT. If I/O is to be done on the file TTY, it must be explicitly mentioned as the first argument to READ, WRITE, etc.
Of course the files INPUT and OUTPUT may be assigned to the terminal by specifying TTY: as the file spec in the initial dialog.

In general TTY can be used with any of the read or write procedures. Actually, however, this is somewhat of an illusion. Internally, the file TTY is only usable for input, and the file TTYOUTPUT is used for output. The user need not normally be aware of this, as all mentions of TTY in output procedures are automatically transformed into TTYOUTPUT. However, for obvious reasons, such mapping cannot be done with buffer variables. Thus should one wish to work with the buffer directly, TTYOUTPUT^ should be used for output. TTYOUTPUT must also be used explicitly with PUT and REWRITE. Note however that TTY is directly connected with the user's terminal via RDTTY and PBOUT. REWRITE and RESET cannot be used to alter this.

In standard PASCAL, RESET(file) does an implicit GET(file), so that file^ contains the first character of the file immediately after the RESET is done. This is fine for disk files, but for a terminal it makes things difficult. The problem is that RESET(TTY) is done automatically at the beginning of the program, so the program would go into TTY input wait before you had a chance to prompt the user for input. To solve such problems, many implementations allow you to specify a file as interactive. Such a specification keeps RESET from doing the implicit GET. In this implementation, TTY is always interactive. Other files can be made interactive by specifying a non-zero third argument in the RESET. (The distinction is irrelevant for REWRITE, and the third argument is ignored for REWRITE.)

For an interactive file, file^ will not contain anything useful until you do an explicit GET. To indicate this fact, the system automatically sets EOLN(file) true after RESET. Thus any program that checks for EOLN and does READLN if it is true will work correctly. (This is done automatically by READ with numerical and Boolean arguments.)
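
The prompting sequence from section 2.5 and the interactive behavior of TTY described above can be combined in a short sketch. The program name ASKIJ and the variable names are illustrative only; the sketch assumes nothing beyond the predeclared file TTY and the zero field width described in section 2.4.

```pascal
program askij;
(* Hypothetical example: ask two questions on the terminal.
   TTY is predeclared and always interactive, so it is not
   listed in the program statement and is never RESET here. *)
var
  i, j, k: integer;
begin
  (* The prompt is written first; no input is awaited yet,
     because no implicit GET was done on TTY. *)
  write(tty, 'Please enter I and J as integers: ');
  readln(tty);        (* move to the line the user now types *)
  read(tty, i, j);    (* leave the rest of that line unread *)

  (* The same pattern repeats for every later question. *)
  write(tty, 'Please enter K: ');
  readln(tty);        (* discard the remainder of the old line *)
  read(tty, k);

  (* A field width of 0 gives free-format integer output. *)
  writeln(tty, 'I + J + K = ', i + j + k : 0)
end.
```

Because each READLN(TTY) comes before the READ rather than after it, the program never waits for a new line until the next question has already been printed.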
2.7 Character Processing

Any character except null (0) can be read or written by a PASCAL program. In the normal case, end of line characters appear in the buffer (e.g. INPUT^) as blanks. This is required by the specifications of the Pascal language. To tell whether the file is currently positioned at an end of line, EOLN should be used. When EOLN is true, the buffer should contain some end of line character (although what actually appears there is a blank). To get to the first character on the next line, do READLN. (If the next line is empty, of course EOLN will be true again.) This is done by the system routine READ when it is looking for numerical input.

Note that carriage return, line feed, form feed, altmode, and control-Z are considered to be end of line characters. However, if the end of line was a carriage return, the carriage return and everything up to the next end of line (typically a line feed) is considered a single character.

If it is necessary to know which end of line character actually appeared, the user can RESET the file in a special mode. When this mode is used, the end of line character appears in the buffer unchanged. You can still tell whether the buffer is an end of line character by using EOLN (indeed this is the recommended practice). In this mode, carriage return is seen as a single character, separate from the line feed. However READLN will still treat a carriage return and line feed as a single end of line. To be precise, READLN will skip to the next line feed, form feed, altmode, or control-Z before returning.

To open a file in the mode where you see the end of line character, specify /E in the options string used in the RESET (see section 3.8), or in the case of INPUT being implicitly opened by the PROGRAM statement, specify INPUT:#. You may request the special file TTY to be opened in this mode by listing TTY:# in your program statement.
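A sketch of how the /E mode might be used to find out which end of line character occurred: the program below counts the lines that end in a form feed. The file name DATA.TXT is a hypothetical example, and 12 is the decimal ASCII code for form feed.

```pascal
program ffcount;
var f: text; formfeeds: integer;
begin
  formfeeds := 0;
  reset(f, 'DATA.TXT', '/E');  (* /E: end of line characters are visible *)
  while not eof(f) do
    begin
      (* EOLN still identifies an end of line character; *)
      (* f^ now holds the actual character instead of a blank *)
      if eoln(f) then
        if ord(f^) = 12 then formfeeds := formfeeds + 1;
      get(f)
    end;
  writeln(tty, formfeeds, ' lines ended with a form feed')
end.
```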
Control-Z is also considered the end of file character for normal files opened on terminals (but not the special file TTY, which has no end of file condition).

Terminal I/O is done in such a way that control does not return to the program until ^G, ^L, ^Z, altmode, line feed, or carriage return is typed. This allows the normal editing characters ^U, ^R, rubout, etc., to be used. This is true with normal files open on terminals as well as the file TTY.

It is possible to cause all lower case letters to be turned into the equivalent upper case when they are read by your program. To set up this process, specify /U in the options string used in the reset. (See section 3.8.)

3. Extensions to PASCAL

We have tried (somewhat unsuccessfully) to avoid the usual temptation of Pascal implementors to add lots of features to the language. The actual language extensions are limited to those built into the base version as it came from Hamburg. However we have added many library procedures, mostly to allow a wider variety of I/O and better access to the system. Only the most common of these have actually been put into the compiler as predeclared procedures. So several of the procedures described below are noted as being external. This means that you must include an explicit procedure declaration for them, with the token EXTERN replacing the procedure body. For your convenience, these EXTERN declarations have been collected into a file called EXTERN.PAS. It may be included in your program by using the statement

     INCLUDE 'EXTERN.PAS';

immediately after your PROGRAM statement. The file name should be prefixed by whatever directory Pascal files usually reside on at your site (e.g. S: at Rutgers).

3.1 Input/Output to strings

It is often convenient to be able to use the number-scanning abilities of READ to process a string of characters in an array of CHAR. Similarly, it may be useful to use the formatting capabilities of WRITE to make up a string of characters.
To allow these operations, this implementation provides a facility to treat a packed array of CHAR as if it were a file, allowing READ from it and WRITE to it. This facility is equivalent to the REREAD and REWRITE functions present in many implementations of FORTRAN.

To make use of this, you must use a file that has been declared FILE OF CHAR. Rather than using RESET or REWRITE to initialize I/O, you use STRSET or STRWRITE instead. These associate a string with the file and set the internal file pointer to the beginning of the string (in the simplest case). A typical call would be STRSET(FILE1,MYARRAY). After that call is issued, FILE1 can be used with READ, etc., and will take successive characters out of the array MYARRAY. Similarly, one might do STRWRITE(FILE2,YOURARRAY), and then use WRITE(FILE2,...) to write things into YOURARRAY. Note that as with a RESET, an implicit GET is done as part of the STRSET. Thus immediately after the STRSET, the first character of the string is in the file buffer.

It is possible to start I/O at a location other than the beginning of the array. To do so, use a third argument, which is the index of the first element to be transferred. E.g. STRSET(FILE1,MYARRAY,5) means that the GET will retrieve element 5 from MYARRAY. (This is MYARRAY[5]. It is not necessarily the fifth element, since the index might be -20..6 or something.)

There is a procedure to see where you currently are in the string. It is GETINDEX(file,variable). Variable is set to the current index into the array. This is the index of the thing that will be read by the next GET (or written by the next PUT).

Note that no runtime error messages will ever result from string I/O. Should you run over the end of the string, PASCAL will simply set EOF (or clear it if you are doing output). It will also set EOF if you read an illegal format number. (GETINDEX will allow you to discriminate these two cases, if you care.)
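The calls described above might be combined as in the following sketch, which scans a number out of a string and then writes a formatted number back into the same string. All identifiers (SFILE, LINE, N, IDX) are hypothetical; only STRSET, STRWRITE, and GETINDEX come from the description above.

```pascal
program strio;
var
  sfile: file of char;
  line: packed array [1..20] of char;
  n, idx: integer;
begin
  line := '  1234 rest of line ';    (* exactly 20 characters *)
  strset(sfile, line);               (* read from LINE, starting at LINE[1] *)
  read(sfile, n);                    (* uses the number scanner of READ *)
  getindex(sfile, idx);              (* index that the next GET would read *)
  if eof(sfile) then
    writeln(tty, 'Bad number or end of string at index ', idx)
  else
    writeln(tty, 'Read ', n, ', scanning stopped at index ', idx);
  strwrite(sfile, line);             (* now write into LINE from LINE[1] *)
  write(sfile, n * 2 : 6);           (* uses the formatting of WRITE *)
  writeln(tty, line)
end.
```

Testing EOF after the READ is the only error check available here, since string I/O never produces a runtime error message.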
There is also a fourth optional argument to STRSET and STRWRITE. This sets a limit on how much of the array will be used. It thus gives you the effect of the substring operator in PL/I. For example, STRWRITE(F1,AR1,3,6) will make it possible to change characters 3 to 6 inclusive. If absent, the fourth argument defaults to the last location in the array.

Note that arrays of types other than CHAR can be used. They must be packed arrays, however. (In order for an array to be considered packed, the elements must take up a half word or less. You can declare an array PACKED ARRAY[..]OF INTEGER, but it is not really considered packed.) Of course the file and the array must have the same underlying type. (This is checked.)

Beware that it is possible to set a file to an array, and then exit the block in which the array is defined. The file is then pointing out into nowhere. This is not currently detected.

3.2 Monitor calls

For those daring souls who want to have access to all the facilities of the machine, it is possible to insert JSYS's into your program. Although this is syntactically simply a call to a predeclared runtime, it compiles code in line. There are so many options that this is best thought of as a sort of macro. It has the following syntax:

     jsys(jsysnum, extralocs, return; arg1, arg2, ...; result1, result2, ...)

The only required argument is jsysnum. This must be an integer constant, and is the number of the jsys to be compiled. When arguments are left out, the associated commas should also be omitted. If an entire group of arguments is left out, the semicolons should also be omitted, except that an extra semicolon is necessary when there are no argi's and there are resulti's, for example

     jsys(3;;i);

Extralocs should be non-zero to insert dummy instructions after the jsys, in case it skips. In the simplest case, the specified number of jfcl's are inserted.
If extralocs is negative, the first inserted instruction is an erjmp, and abs(extralocs) instructions (counting the erjmp) are inserted.

Return allows you to know how the jsys returned. It makes sense only with non-zero extralocs. Return must be an integer variable, and will be set to a number indicating which return was taken, 1 for non-skip, 2 for one skip, etc. If an erjmp is activated, return will be set to abs(extralocs)+1, i.e. one larger than the largest value normally possible.

Arg1 ... are expressions which will be put into registers 1 up. There are a number of special cases:

- Simple variables and expressions have their values loaded.
- Sets have the first word only loaded. (This lets you use the first 36 entries of a set to set arbitrary bits.)
- Files have their jfn loaded, with index bits in the left half in the normal case. (So use 0:file if you just want the jfn.)
- Packed arrays have a byte pointer to the first element loaded.
- Complex variables (records, arrays, etc.) have their address loaded.
- To specify half words, use a colon between the left and right half. E.g. to print the most recent error on the terminal you might use

     jsys(11B,2;101B,400000B:-1,0)

  The 400000B:-1 specifies that register 2 will have 400000 octal in the left half and -1 in the right half. All the special cases defined above may be used this way, though only the low order half word of the values will be used, obviously.

Result1 ... are variables specifying where the contents of registers 1 up should be put after the jsys is executed. These are treated as above, except that "complex variables" are not allowed. (It is not in general clear what they would mean.) The code may not work correctly if you try to return registers above 4. (5 and up are used as temporaries for handling certain obscure cases.)

This monstrosity is compiled inline.
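As a sketch of how the return argument fits into the syntax, the error-printing call shown above can be extended with a variable that records which return was taken. The variable name RET is hypothetical, and what each return means is defined by the monitor's description of the jsys, not by Pascal.

```pascal
program lasterr;
var ret: integer;
begin
  (* jsys 11B with 2 extra locations, so RET can come back as 1, 2, or 3 *)
  (* register 1 := 101B, register 2 := 400000B,,-1, register 3 := 0 *)
  jsys(11B, 2, ret; 101B, 400000B:-1, 0);
  writeln(tty);
  writeln(tty, 'The jsys took return number ', ret)
end.
```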
It is best thought of as a macro, since the code generated depends quite heavily on which arguments are supplied, and to a certain extent on their values and types. The code has a few inelegancies, but is probably faster than calling even a well-coded assembly language subroutine.

If you are going to do I/O with jsys on a Pascal file, you should probably be sure that the file was opened in byte mode, unless you really know what you are doing.

Needless to say, this construct is an "escape from Pascal". I.e. there is no protection against doing something with a jsys that will clobber your program or the Pascal runtimes.

3.3 INITPROCEDURE

Variables of type scalar, subrange, pointer, array or record declared in the main program may be initialized by an INITPROCEDURE. The body of an INITPROCEDURE contains only assignment statements. Indices as well as the assigned values must be constants. Assignment to components of packed structures is possible if the components occupy a full word.

The syntax of an INITPROCEDURE is as follows (the parts enclosed in [ and ] may appear 0 or more times):

     <initprocedure> ::= INITPROCEDURE ; BEGIN <assignments> END;
     <initpart>      ::= <initprocedure> [ <initprocedure> ]

The <initpart> must follow the variable declaration part and precede the procedure declaration part of the main program.

Note that INITPROCEDURES do not compile into code. Instead they put the values specified into appropriate places in the .REL file, so that the variables are initialized by loading the program. This means that you should not attempt to call an INITPROCEDURE. It also means that if you restart a program (e.g. by ^C-START), the INITPROCEDURES will not be redone. We recommend very strongly that INITPROCEDURES only be used for constant data.

3.4 Extended CASE Statement

The CASE statement may be extended with the case OTHERS, which then appears as the last case in the CASE statement. This case will be executed if the expression of the CASE statement does not evaluate to one of the case labels.
In the following example it is assumed that the variable X is of type CHAR:

     CASE X OF
       'A' : WRITELN('VALUE IS A');
       'B' : WRITELN('VALUE IS B');
       OTHERS : WRITELN('VALUE IS NEITHER A NOR B')
     END %CASE STATEMENT\

3.5 LOOP Statement

The LOOP statement is an additional control statement which combines the advantages of the WHILE and the REPEAT statement. The LOOP statement has the following syntax:

     <loop statement> ::= LOOP
                              <statement> [; <statement> ]
                          EXIT IF <expression> ;
                              <statement> [; <statement> ]
                          END

The expression must result in a Boolean value. Note that there must be exactly one EXIT IF in each LOOP.

3.6 CLOSE, RCLOSE, and DISMISS

It is often desirable to close a file during execution of a program. This may be needed to make it available to other programs (under certain circumstances), or simply to remove the possibility of accidentally changing it if you know you are not going to use it again. Thus the procedure CLOSE is available. The normal call is

     CLOSE(file)

This closes the file, but does not release the jfn. Thus any future RESET, REWRITE, etc., of this file will use the same jfn (i.e. name, etc.) as the one just closed, unless a new file spec is given explicitly.

An optional second argument may be given, CLOSE(file,bits). It is an integer, which will be used for accumulator 2 in the CLOSF jsys. (Only experts will need this. Note that the bit CO.NRJ is set automatically to keep the jfn from being released.)

RCLOSE works just like CLOSE, but releases the jfn. It is not obvious what this is good for. It also takes an optional second parameter.

DISMISS closes the file, but rather than finishing out the normal writing, it aborts the creation of the file. If the file is on disk, it is expunged. If it is on tape, the normal writing of EOF's, etc., is skipped.

3.7 MARK and RELEASE

MARK and RELEASE can be used to organize the heap like a stack. Both have one parameter which must be of type INTEGER.

MARK(X) assigns to X the current top of the heap. The value of X should not be altered until the corresponding RELEASE(X).
RELEASE(X) sets the top of the heap to X. This releases all the items which were created by NEW since the corresponding MARK(X).

Use of RELEASE is dangerous if any of the records released contains a file. DISPOSE of a record containing a file will correctly close the file. However RELEASE is a bit more wholesale, and files will not get closed.

Note that MARK and RELEASE are probably not useful with programs that use DISPOSE, since DISPOSE invokes a dynamic memory manager that does not allocate the heap as a simple stack.

3.8 I/O facilities for wizards only

PASCAL has the ability to use the full I/O capabilities of Tops-20. This includes wildcards, file updating, etc.

Before we discuss the facilities available for I/O, it will be helpful for the reader to understand the relation between Pascal files and jfn's (the "handles" used to refer to physical files). A Pascal file variable secretly is associated with a jfn. The jfn is assigned at startup if the file is listed in the program statement. It may be changed by supplying a new file spec in RESET, etc., or by doing explicit jsys's. If RESET, etc., is done without specifying any file spec, the jfn currently associated with that file is used (if there is one - otherwise a default file spec is used to generate a jfn, and the file is marked as internal). This means that sequences of reset, rewrite, etc., all refer to the same jfn, i.e. to the same version of the file, unless a new file spec is given.

A few functions change the jfn associated with the file. RENAME replaces the old jfn with a jfn associated with the new name. NEXTFILE releases the old jfn completely when you come to the end of the files associated with the file spec. There is nothing I can do about this, because of the way the monitor is built.

3.8.1 Extra arguments to RESET, etc.

Most of the options available for I/O are specified in arguments to RESET and REWRITE.
The full form of these procedures includes the following arguments:

     RESET (<file>,<file spec>,<interactive>,<gtjfn bits>,<openf bits>,<flags>)

You may omit trailing parameters. (They are taken as 0.)

As a convenience for typing bits and flags, a set may be used as the argument for <gtjfn bits>, <openf bits>, or <flags>. The first word of the set is passed. Fortunately, the Pascal representation of a set of integer is such that [n] produces a word with bit n turned on. E.g. to specify bits 3 and 5 (1B3!1B5 in Macro), one can use [3,5].

If a file spec is given, any existing jfn will be released, and a new one gotten for that file spec. The file spec may be followed by :*@. The * and @ are each optional, and may appear in either order. * indicates that wildcards should be allowed in the spec. @ indicates that the runtimes should do a gtjfn from the terminal. Confirmation and messages will not be turned on by default, but you may set the bits gj%cfm and gj%msg yourself in the gtjfn bits. + and - are not needed as in the program statement, since the runtimes know whether the file is input or output. Effectively + is used for rewrite and - for the others. To omit the file spec, type ''. (To get a spec from the tty:, you normally supply '':@ )

The form shown above is the old, full form of the RESET. Because no one (including me) could remember which bit is which, a new form is provided which allows you to set the most useful bits by the use of switches. To do this, pass a string for the third parameter, e.g.

     RESET(F,'A.B','/I/E')

Here are the meanings of the switches. For details on what they do, you will have to look below where the bits that they set are described. Note that you can mix the two notations, i.e. use a string for the third parameter and then go on to set bits in the later parameters. The bits set in the two ways are or'ed together.

  /B:nn  Byte size specification. The number specified goes into the byte size field of the OPENF word. It is mainly useful for handling industry-compatible magtape, wherein 8 bit bytes are useful.
         For details about the meaning of the byte size, see section 3.8.4.

  /D     Data transmission errors will be handled by the user. See the section below on error handling (section 3.8.3). A data transmission error is usually a physical problem of some sort. See /F for problems with the format of the data.

  /E     End of line characters will be visible to the program. Normally Pascal programs put a blank in the input buffer at the end of line. If this flag is set, the actual end of line character appears in the buffer. Normally a single GET will read past both a carriage return and a line feed, since they are a single line terminator. But if /E is set, the carriage return and the line feed will be treated as separate characters, although READLN will still skip them both.

  /F     Format errors in the data will be handled by the user. See the section below on error handling (section 3.8.3). A format error occurs when the data is readable, but is not what READ wants. E.g. when trying to read a number, a letter is found.

  /I     Interactive file. The meaning of this has been discussed above. It keeps the system from reading the first component in the file, as it normally does whenever a file is opened.

  /M:nn  Mode. This allows you to specify the internal software mode that will be used in processing the file. See section 3.8.4.

  /O     Open errors will be handled by the user. See the section below on error handling (section 3.8.3). An open error is an error that occurs during the RESET, REWRITE, etc. Most commonly it occurs when the specified file is not present, or when there is a protection problem (e.g. you aren't allowed to read the file).

  /U     Upper case the file. All lower case letters will be turned into the equivalent upper case. Only letters are affected.

The normal user may skip to the end of this section now. The other parameters are intended mainly for use by hackers.

The 3rd parameter suppresses implicit Gets, as for Tops-10.
It is not used for rewrite, as protection is put in the file spec following usual Tops-20 conventions.

Normally you should specify 0 for the gtjfn and openf bits. The runtimes will supply those bits needed to carry out the operation. However if you do specify non-zero bits, they will be xor'ed into the bits supplied by the runtimes, except that certain bits will be ignored (e.g. those that control whether it is a long or short form gtjfn). The bits supplied by default are:

     for reset, update, and append:  gj%old, gj%flg, gj%sht
     for rewrite:                    gj%fou, gj%flg, gj%sht

Those bits that may not be changed by the user are gj%flg, gj%sht, gj%jfn, gj%ofg, and gj%xtn.

Currently the flag word is used for the following:

     left half     - buffer or record size, in bytes. See section 3.8.4.
     bit 1B        - map lower case to upper
     bits 16B      - control of error processing. Error processing is explained in section 3.8.3.
     bits 7700B    - number of buffers or pages for buffering. See section 3.8.4.
     bits 770000B  - I/O method. This allows you to override Pascal's usual method of handling I/O. See section 3.8.4.

3.8.2 Labelled tape processing

Tops-20 Pascal has fairly powerful facilities for dealing with labelled tapes. These facilities use Tops-20's special labelled tape support, so they are not yet available for Tops-10.

To read a labelled tape, you normally do not need to do anything special. However when you want to write a labelled tape, you often may want to specify exactly how the file is to be written. To do this, you include "attributes" as part of the file name. Here is a typical file name that uses attributes to control the format of a file on tape:

     MT0:DATA;FORMAT:F;RECORD:80;BLOCK:8000

This entire string is considered to be the file name. You can type such a string when you are asked about the files in the program statement. Or you can supply such a string as the second argument to a REWRITE statement.
This particular string asks for a file called DATA to be written in fixed format, with records of length 80, blocked 100 per block. Pascal will fill lines that are shorter than 80 characters with blanks.

The record format is described by ;FORMAT:x, where x is a single letter. The following formats are supported by Pascal:

  U (undefined)  This is the default for output files. If this is what you want, you do not need to use any attributes at all. "MT0:" alone would be sufficient. Pascal assumes that a file in format U has the same structure as a disk file. That is, end of lines are denoted by carriage returns, etc. Pascal will ignore physical record boundaries on the tape. If you do not do anything special, such files can be copied back and forth between disk and tape using a simple COPY command in the EXEC. You might want to specify a block size in order to make more efficient use of the tape. E.g. "MT0:;BLOCK:5120". Tapes written in format U will probably not be readable on computers other than a DECsystem-10 or DECSYSTEM-20.

  D (variable)  This format is useful for text files that you want to be able to move to other computers. This format uses special record headers to show where lines begin and end. Most other computers understand these headers, but do not understand the normal DEC convention of using carriage returns to end a line. Also, tapes are coded one character per tape frame, which is what other systems expect. Unless you really know what you are doing, you will only be able to use D format for text files (files declared TEXT or FILE OF CHAR). To use this mode, you should specify something like "MT0:;FORMAT:D;BLOCK:5000". The block size should be chosen large enough to make reasonably efficient use of tape, but not to waste memory.

  F (fixed)  This format is also usable to communicate with other computers. In it there is nothing on the tape to show where lines end. However all lines are the same length.
Thus the system can find the end of a line by counting characters. These tapes are also coded one character per tape frame. The example above showed how to specify format F. You should specify both a block size and a record size. Pascal will fill out all lines to match the record size, by putting blanks at the end of each line that would be too short. The system will put as many records into one physical tape block as it can fit there. The block size must be an exact multiple of the record size. Again, the block size should be big enough not to waste tape. Unless you are an expert, you will only be able to use this mode for text files also.

  S (spanned)  This is a somewhat unusual mode. It is very similar to mode D. However the record headers are written in such a way that records can cross block boundaries. This mode makes somewhat more efficient use of tape, but is more complex to process. Many other computers do not support it.

When you are reading a labelled tape, Pascal can tell the structure of the tape from the label. Thus you should not have to specify any attributes for an input file.

In addition to the tape format, which is described by the FORMAT attribute, there is also a distinction between "stream I/O" and "record-oriented I/O". Stream I/O is the normal style on Tops-20. With it, line and record boundaries are indicated in the data, e.g. with carriage returns. The tape records are completely ignored by the program. Record-oriented I/O is used when the tape records should match the line or record structure of the file. In record-oriented I/O, there would usually be no carriage returns in the file, since the end of line would be indicated by the record structure.

By default, Pascal uses stream I/O for format U and record-oriented I/O for the other formats. You can override this default by specifying '/M:7' in the option string for record-oriented I/O and '/M:5' or '/M:6' for stream-oriented I/O. See section 3.8.4 for the difference between these two modes.
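As a sketch of how an attribute string might be supplied from within a program, the following writes ten lines to a labelled tape in format F, using the example file name from this section. The program and variable names are hypothetical; with a record size of 80 and a block size of 8000, each block holds 100 records.

```pascal
program totape;
var t: text; i: integer;
begin
  (* the whole attribute string is the file spec, passed as argument 2 *)
  rewrite(t, 'MT0:DATA;FORMAT:F;RECORD:80;BLOCK:8000');
  for i := 1 to 10 do
    writeln(t, 'Line number ', i);   (* short lines are blank-filled to 80 *)
  close(t)
end.
```

No option string is given, so the default for format F (record-oriented I/O) applies.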
3.8.3 I/O Error processing

Errors may occur at three different times. Each of the resulting three error types has a bit associated with it in the flag parameter to RESET, etc. The types (and associated bits) are

     2B   errors during actual I/O operations (get, put, ...)
     4B   errors during number conversion (read, write)
     10B  errors during file opening (reset, rewrite, ...)

When an error occurs, the appropriate bit in the flag word is checked. If it is 0, a fatal error message is printed, using the erstr string. If it is 1, the error number (from geter) is stored where you can get it, EOF (and EOL) is set to show you that you can no longer read or write, and the program continues. Normally future I/O operations become no-ops until you recover the error. The runtimes are constructed so that the high-level read routines (READ and READLN) return immediately with no effect when EOF is set. Other routines proceed as usual, but usually the monitor sees to it that they have no effect.

The moral is, if you set one of the magic bits, you had better check EOF (at least, probably also ERSTAT) now and then. Note that if you set bit 10B (open errors) but not bit 2B (I/O errors), the open error will generate a fatal I/O error at the first I/O operation, since the monitor refuses to do I/O with a file that has not been successfully opened!

The EXTERN function ERSTAT(file) [section 3.16] may be used to look at the most recent monitor error code for a given file.

PASCAL does not attempt to do any error recovery beyond printing error messages and saving error numbers. It is assumed that users who specify user error processing do not want us interfering. In most cases I/O operations in which an error occurs are no-ops. Thus if a get fails, a succeeding get will try to read the same character. Of course in most cases this will again fail, so the user will want to advance the file (SETPOS [section 3.12] for disk, some mtopr function using JSYS for tape).
Note however that in the case of record I/O, some of the record may have been read successfully. In this case GETX [section 3.8.6] will attempt to reread the tail of the record (and again will usually fail). I.e. reading will begin with the byte for which it blew up before.

The monitor has a restriction that no I/O is possible when an error bit is set in the file status word. Thus in general one will want to clear this error indication before doing any retry or other recovery. Thus the EXTERN routine CLREOF(file) [section 3.16] both clears the internal Pascal EOF indication and does a STSTS jsys to clear the error and EOF flags in the monitor. Note that if the error is a real end of file, it can be cleared by doing a random access SETPOS [section 3.12] to a byte number within the file. CLREOF is not necessary to clear an end of file condition before random access. (SETPOS will not clear an error other than end of file, however.)

3.8.4 I/O implementation

In order to get both efficiency and convenient operation, Pascal has to have a set of device-dependent I/O routines. Unless you override it, Pascal will use a routine that gives the best performance for the device type that each file is on. However, you can explicitly specify which routine you want used for a given file by putting the number of the routine in bits 770000B of the flag word in RESET, etc. If these bits are 0, Pascal will use its own choice. In the sections below, we describe each of the routines, by number.

In all cases, a file has associated with it a "logical byte size". If the user specifies a byte size in the openf flag word, that size is used. Otherwise, 7 bits is used for text files, and 36 bits for record files. In most cases the file is opened using the logical byte size. However in a few cases full word I/O is done and the logical byte size is simulated by getting bytes of the specified size from Pascal's I/O buffer.

Note that several of these routines use buffering.
The overhead for a monitor call is sufficiently high that it is very expensive to get each byte separately. So buffering involves getting a large number of bytes from the monitor at once and then giving them to the user one by one. Or for output, collecting bytes in a buffer and sending them to the monitor at once when the buffer is full. While it improves efficiency, this method introduces two kinds of timing problems.

First, sometimes the user will want to synchronize logical and physical I/O. For example, if you rewind the tape, you will want to be sure that any bytes built up in the buffer are put out before the rewind takes place. Similarly, after a rewind, you will want to be sure that any bytes read come from the beginning of the tape, rather than being left over in the buffer from before you did the rewind. Thus there are two routines for clearing the buffer:

     BREAK(file)    - forces out any bytes in the buffer for output
     BREAKIN(file)  - clears any leftover bytes in the buffer for input. Afterwards, does an implicit GET, unless it is suppressed by BREAKIN(file,true)

The details about these routines depend upon the exact method used for buffering, so they are described in detail below for each of the buffered modes. Note that BREAK and BREAKIN should not be used with SETPOS (random access movement [section 3.12]), as SETPOS does its own synchronization.

The other synchronization problem involves errors. In byte by byte I/O you get an error exactly when it happens. Then there is no timing problem - the routines simply set EOF or print an error message immediately. However in buffered I/O, errors will usually happen when the routines are in the middle of filling a buffer. The error cannot be acted on immediately, because some valid data has been gotten. So the end of file, or other error, is delayed until all of the valid bytes have been given to the user.
The problem is that BREAKIN and SETPOS (random access positioning [section 3.12]) essentially throw away the remaining bytes in the buffer. Thus they have to cause any saved error to be triggered immediately. (It would be misleading to wait and give an old error when you tried to read a completely different file after a rewind. But you have to be told about the error, since further processing is impossible with some devices until the error is cleared.) Thus you should not be shocked to find that BREAKIN and SETPOS, which are essentially bookkeeping operations, can cause I/O errors to be triggered and set EOF.

Certain I/O methods allow you to specify the size of buffer you want used. This will be mentioned in the description below if it is applicable. The specification takes bits 7700B in the flag word. This argument is ignored by methods that do not implement it. Details are given under the descriptions of those methods that implement this option. At the moment it is implemented only for Pmap'ed mode. Note that this specification affects only the efficiency of your program.

3.8.4.1 Byte mode[1]

This routine does bin and bout jsys's for text I/O, and sin and sout jsys's for record I/O. No internal buffering or other processing is done: each GET of a character is a BIN, etc. This avoids the various timing problems mentioned, but is unbelievably slow for text files and small records. For large records it may not be too bad. This method is used only for devices for which Pascal can't figure out anything better, i.e. all devices for which no special routine is listed below.

You should specify this mode explicitly whenever you wish to do your own I/O jsys's intermixed with Pascal I/O, unless you are prepared to give some care to the transition between Pascal I/O and your own. (I.e. BREAK, BREAKIN, etc.)

With this routine, the file is opened using the logical byte size and all operations utilize byte pointers of that size.
BREAK and BREAKIN are meaningless in byte mode.

3.8.4.2 Pmap'ed mode[2]

This routine attempts to handle disk I/O in the optimal way. Pages from the file are Pmap'ed directly into a buffer within Pascal. This makes performance particularly good for random access updating, but is always the best way to handle the disk. When it can, Pascal will always use this routine for disk files. However it will not be able to do so when you are writing or appending to a file to which you do not have read/write access. (In that case buffered mode, by bytes, is used.) This routine will work only for disk.

Should you wish to do some jsys of your own with a file open in pmap mode, you may want to do BREAK(file) or BREAKIN(file). BREAK will unmap the file from the buffer, and would normally be used after doing Pascal I/O to allow you to play with the file more freely. BREAKIN clears Pascal's internal entries about the file, and should be done before doing more Pascal I/O. (BREAKIN also returns you to the beginning of the file, so be careful.) Note that these are somewhat odd definitions of BREAK and BREAKIN.

In this mode, the file is opened using the logical byte size. However the pmap jsys, which this uses, ignores the byte size, giving whole words. The logical byte size is simulated by the routines that remove and add bytes in the buffer.

The efficiency of I/O is affected remarkably by the size of the buffer, i.e. the number of pages mapped at one time. By default, PASCAL will use 4 pages. We believe that almost all users will be satisfied with this default. However you can specify the buffer size yourself, using the field 7700B in the flag argument to RESET, REWRITE, etc. What goes in this field is the number of pages used for the buffer. For example to open INPUT with 16 pages of buffering, you might use RESET(INPUT,'',0,0,0,16*100B). Multiplying by 100B is a convenient way to move the value into this field.
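As a rough sketch of the buffer-size field in use, the following program opens a disk file with a 2-page buffer. The program name SMLBUF and file name DATA are hypothetical, chosen only for illustration, and the argument positions simply copy the RESET(INPUT,'',0,0,0,16*100B) call above. Note that 2*100B is 200B, which falls inside the 7700B field.

```pascal
PROGRAM SMLBUF(DATA);
(* Hypothetical example: the names are illustrative only.       *)
(* The argument layout mirrors RESET(INPUT,'',0,0,0,16*100B).   *)
VAR DATA:FILE OF INTEGER;
BEGIN
  (* Ask for 2 pages of buffer: 2*100B = 200B in field 7700B *)
  RESET(DATA,'',0,0,0,2*100B);
  (* Read through the file one record at a time *)
  WHILE NOT EOF(DATA) DO
    GET(DATA)
END.
```

Because the field affects only efficiency, the program should behave the same way whatever page count is given.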
If you are doing a lot of random access it might make sense to specify 1 or 2 pages, and if you are doing sequential I/O with very large record sizes, it might make sense to specify up to 16 pages. (Sizes over 16 pages do not improve performance noticeably.) There is an absolute maximum of 36 pages. If you declare a buffer larger than 36 pages, 36 will be used.

Note that very large buffer sizes only make sense in very special cases, usually involving binary files with very large records. Programs that process text files will probably not benefit from buffer sizes larger than the default. There is enough CPU time involved in handling individual characters that such a program will never be heavily I/O bound. Programs that do only random access I/O may perform slightly better with 1 or 2 page buffers, since preloading a bigger buffer is just a waste of time. Also, DEC suggests that very small machines (smaller than 256K?) may be better off using small buffer sizes. (If you find this to be the case, you can change MAPBFS at the beginning of PASIO.MAC to 1 for your site.)

3.8.4.3 TTY mode[3]

DEC has supplied a standard "line editor" for use with the terminal. This is what carries out ^U, ^R, rubout, etc. Most users will want this line editor activated when a program is reading from the terminal. Thus Pascal will handle all terminals in TTY mode.

This routine uses the TEXTI jsys, which calls the standard DEC line editor. With this routine in effect, you can use rubout, etc., until you type an "activation character" (usually carriage return). At that point the entire line is passed to the program. The complete list of activation characters is: , , altmode, ^G, ^L, and ^Z. Also in this mode when you type a ^Z, end of file is generated.

For output, the bout jsys is used, so that characters appear on the screen immediately when you output them. (I.e. no buffering is done.)
This is somewhat inefficient, but avoids requiring BREAK whenever you want output to show up immediately.

This routine can be used for any device, but there is no obvious use except for a device where TEXTI provides editing.

For this routine the file is opened with the logical byte size. However the bytes are packed into the buffer as 7 bit bytes, so it is not clear what effect a byte size other than 7 bits has. If this mode is specified for a non-text file (i.e. record I/O), byte I/O is actually used.

Since input is buffered, the comments above about buffering hold for the input side. I.e. BREAKIN would be needed if you were using this with tape and did a rewind, and errors are delayed as described above.

3.8.4.4 Null mode[4]

This routine is used only with device NUL:. It does nothing on output, and simulates end of file on input. It is far faster than actually using the I/O jsys's for device NUL:. It should not be used for other devices unless you wish them to do no I/O. Random access is legal, but has no effect. CURPOS always returns position 0.

3.8.4.5 Buffered mode, by word[5]

This routine is designed to handle devices that are capable of binary I/O as efficiently as possible (except for disks, which are handled a bit more efficiently by Pmap mode). It is legal only for transfer in a single direction. I.e. Update is illegal. It uses a one page internal buffer to store up characters as they are being input or output by the program. When the buffer fills (output) or is emptied (input), a single SIN or SOUT is done to transfer the whole buffer. The comments above about buffered I/O all apply (BREAK, BREAKIN, and delay of errors on input). Random access is illegal in this mode.

Note that the device is opened for 36 bit bytes in this mode. This makes transfers a bit more efficient, and allows the runtimes to detect line numbers (which cannot be seen in modes where the file is open for 7 bit bytes). The byte size specified by the user is simulated.
Thus as far as the user is concerned his byte size was used. However this means that if the user has specified a record size for a magtape, his number will be taken as the number of words per record. It also means that transfers will be rounded up to the next integral word.

This mode is used by default for magnetic tape drives.

3.8.4.6 Buffered mode, by bytes[6]

This mode is similar to buffered mode by words. Transfers are still buffered by a one page buffer. However the file is opened using the byte size specified by the user, and the actual transfers are done in terms of such bytes. In most cases this is not an advantage, so this mode is used by default only for devices that I suspect may act strangely when a byte size of 36 is specified: line printer and card reader. It is possible to do random access in this mode if it is specified for disk files, however you are limited to unidirectional transfers (no Update), so it seems to have no advantage for disk. The comments above about buffering apply to this mode (BREAK, BREAKIN, and delay of errors).

This mode is used for disk files when append or rewrite is specified and the user does not have read/write access to the file.

3.8.4.7 Record mode[7]

This mode is intended mostly for use with variable-size records on magtapes, or with labelled tapes. What it does depends upon whether you are dealing with a text file or a binary file. With a text file, each line is a record on tape. With a binary file, each Pascal record is a record on tape.

With binary files, this is the same as byte mode except that SINR and SOUTR are used for record I/O rather than SIN and SOUT.

Each GET reads one record from the tape into the Pascal file buffer (FILE^). If the record on the tape is shorter than the size of the Pascal record into which it is being read, the extra bytes in the Pascal record are not changed. If it is longer, the extra bytes on the tape are lost. In any case, each GET reads a new record.
To see how many bytes were transferred, call the function LSTREC(file) [See 3.16.] Note that the record size specified to the monitor (in the SET RECORD command or mtopr jsys) sets a maximum to the record size you can transmit. As long as that maximum is large enough, it has no other effect.

Each PUT writes one record from the Pascal file buffer (FILE^) onto the tape. See 3.8.6 for how to specify the size of the record. If you do nothing special, the record size will be that of the Pascal record. (If variants are involved, it will be the size of the biggest variant.) Note that if a byte size other than 36 is specified, the monitor will pack the bytes into full words, and round the number of bytes written up to the next highest full word. Also, there is a minimum record size of 4 words.

If you are dealing with text files, each line will be made into a record on tape. Thus WRITELN will force out a record, and EOLN will be true when you reach the end of an input record. Note that no explicit end of line characters occur when using this mode. Thus '/E' as an option has no effect in this mode.

To specify the structure of a tape file, use "attributes". For example, "MT0:;FORMAT:F;RECORD:80;BLOCK:800" will result in fixed records of 80 bytes, blocked 10 to a block. Pascal makes every attempt to implement all meaningful combinations of attributes. You can often use attributes that will not work in the EXEC's COPY command. When format F is in use, Pascal will fill records with blanks to reach the specified record size if necessary. Note that even in Pascal you cannot write an EBCDIC tape. This is a limitation of the monitor about which Pascal can do nothing.

Note that the I/O procedures GETX and PUTX have no obvious meaning for tape I/O and are illegal in this mode.

The mode for tape I/O is chosen as follows:

If you have specified an I/O mode, that mode will be used. No other defaulting will be done.
In this case, the record format will be chosen by the monitor if you have not specified it in an attribute.

If you have not specified an I/O mode, then the Pascal magtape default module is called. The defaulter must first find the record format. For an input file, the format is gotten from the tape label. For an output file, Pascal first checks to see if you have specified ;FORMAT in the filename. If so, the format you specified is used. If you do not, output files are always written with format U.

Once the format is known, the defaulter can then choose the I/O mode. For format U, a different mode is used for output and input. Output uses buffered mode, by words. Input uses buffered mode, by bytes. The usual byte size is used (7 for text, 36 for binary, or whatever you specify by '/B:nn'.) For all other formats, record mode is used.

Note that record formats other than U may be a bit hard to use with binary files. DEC forces 8 bit bytes internally for formats other than U. Pascal uses bytes in a way that may have turned out to be wrong: each word of a binary data structure is written as one byte. This is fine with format U, with a byte size of 36. But if you try to use it for one of the other formats, the low-order 8 bits of each word will get written in a tape frame. This may not be what you want.

3.8.4.8 Other modes

I am currently considering how tapes should be handled. I suspect that some additional modes for IBM format tapes will be added eventually. Note that if you specify a number corresponding to an unimplemented mode, byte mode will be used.

3.8.5 A note on byte sizes in files

The documentation above describes I/O as occurring in bytes. On the DECSYSTEM-20 a word contains 36 bits. It may be divided into smaller units called bytes. The bytes will be left justified, and will not be split across words. Thus 7-bit ASCII text is stored in 7-bit bytes, 5 to a word, left justified.
I/O gets bytes from the monitor (possibly with intermediate buffering by the runtimes, for efficiency) one at a time.

There are two types of PASCAL I/O: text and record. Text I/O is what you get when you use TEXT or FILE OF CHAR. The PASCAL runtimes assume that every GET from a text file returns one ASCII character (and that every PUT puts out one character). Internally GET just gets one byte from the file. Thus everything works nicely for the usual kind of file, assuming you accept the default byte size of 7 bits. Since the usual file has characters packed 5 to a word, getting one byte out of the file does indeed get one character.

However, you can change the byte size. If you used a byte size of 8 bits with a normal file, there would obviously be trouble, since the 5 characters stored in each word would be distributed over 4 bytes of 8 bits each. The usefulness of a byte size of 8 is for industry-compatible magnetic tapes. Since these tapes in general contain 8-bit ASCII or EBCDIC, the monitor packs 8-bit bytes 4 to a word in the monitor buffer. In the case of ASCII the high-order bit of the byte is parity, and may be ignored. So to read such a tape one must (explicitly or implicitly) specify a byte size of 8 bits, so that when GET gets a byte the byte is really one character. A byte size of 36 bits would make sense for text files only in some weird case where you have data packed one character per word in the file. That is because a byte size of 36 bits means that each GET returns one word, with no unpacking.

Because text I/O is assumed to involve ASCII characters, each byte is truncated to the 7 low order bits immediately after input. Thus in case parity is included as an 8th bit, it will not mess up the runtimes. Should you need to see the parity, you will have to make other arrangements, probably by using record I/O with a record type PACKED ARRAY OF 0..377B.

Record I/O is any I/O not involving a FILE OF CHAR.
In this case data is transferred from the monitor buffer to the PASCAL file buffer (FILE^) by putting each byte gotten from the monitor into a separate word in the PASCAL record. Thus a byte size of 36 bits causes the PASCAL record to have the same structure as the file. If a byte size of 7 bits were used, each 7-bit byte in the file would be moved to a separate word in the PASCAL record. Thus it would be appropriate if the file is a usual text file, but the PASCAL record is an unpacked array of CHAR.

Note that the I/O runtimes do not check the byte size to be sure it makes sense. So it would be perfectly possible for you to use a byte size of 7 bits to read into a PACKED ARRAY OF CHAR, even though that would make no sense. It makes no sense because it causes one 7-bit byte from the file to be put in each word of the PACKED ARRAY. But a PACKED ARRAY OF CHAR should have 5 characters in each word.

Except for special effects, one usually uses a byte size of 36 for record I/O. Then each input word is moved into a word in the record, and you can do what you like with it.

You can now understand why the default I/O modes use 7 bit bytes for text I/O and 36 bit bytes for record I/O.

3.8.6 Variable Record Formats

Standard PASCAL has a problem when it tries to read files created by non-PASCAL programs. Every call to GET or PUT transfers a fixed number of words to or from the buffer variable. This is fine for files whose records are all the same length and format, but for other files it is a mess.

To avoid these problems, we have extended the format of GET and PUT, to allow GET(<file>[,<variant>]*[:<array size>]), or the equivalent for PUT. If the file type is a variant record, you may use FILE,VAR1,VAR2... to specify the exact variants. This is exactly like the syntax for NEW, as documented in the Revised Report. Furthermore, if the file is an array, or the selected variant ends in an array, you may specify the number of elements in the array to be used.
For example we might have

   TYPE REC=RECORD CASE BOOLEAN OF
          TRUE: (INT:INTEGER);
          FALSE: (J:BOOLEAN;K:ARRAY[1:100]OF INTEGER)
          END;
   VAR F:FILE OF REC;
   BEGIN
   RESET(F,TRUE);  %TRUE to prevent the implicit GET\
   GET(F,TRUE);    %TRUE to select the variant "TRUE"\
   GET(F,FALSE:5);
   GET(F)
   END.

The first GET would read one word of the file, since the variant TRUE requires only one word. The second GET would read 6 words of the file. One word is for the Boolean J, and 5 for the first five elements of K, since the argument 5 specifies that only 5 are to be used. The final GET would read 101 words from the file, which is the space required for the longest possible variant. Note that the argument after the colon is an index into the array. (I.e. if the array is [-5:1], 1 means all 7 members of the array.)

If you do not know in advance the size of a record being read, there are two ways to handle the problem. If you know you will always be using mag tape, you can open it in mode 7 (record mode). Then each GET will get exactly one record, and you can then look at it to see what kind it is. (You can use LSTREC [See 3.16.] if you can't tell the size of the record by looking at it.)

However on other devices there is no builtin record structure, so you must usually depend upon reading an initial piece of the record that tells you either the record type or length. Then you know how long the record is, and can read in the rest. For example, the record might begin with a type code. Obviously we have to read the code before we can specify how to read the rest of the record. Thus there is a procedure, GETX, to continue reading a single record. With GETX, the data transfer begins where the previous GET stopped, rather than at the beginning of the record. Suppose our record is a simple array of integers. Then we might do GET(file:1) to read a length code, and GETX(file:L) to read the rest of the record.
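The two-step read might be sketched as follows. This is only an illustration: the program name VARLEN and type name BLOCK are hypothetical, and it assumes that the first word of each record holds the total number of words in the record, counting the length word itself. RESET(F,TRUE) is taken from the example earlier in this section.

```pascal
PROGRAM VARLEN(F);
(* Hypothetical sketch of reading one length-prefixed record.   *)
(* Assumption: the first word of each record holds the total    *)
(* number of words in the record, counting the length word.     *)
TYPE BLOCK=ARRAY[1:100] OF INTEGER;
VAR F:FILE OF BLOCK;
    L:INTEGER;
BEGIN
  RESET(F,TRUE);      (* TRUE to prevent the implicit GET *)
  GET(F:1);           (* read only element 1, the length code *)
  L := F^[1];
  (* Continue the same record through element L, if it fits *)
  IF (L > 1) AND (L <= 100) THEN
    GETX(F:L)
END.
```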
Note that after the GETX the code is still present in the buffer as the first member, so the L counts it.

GETX is an alternative to using "record mode". It is not meaningful to use both methods of detecting record length at the same time. (Indeed GETX will result in an "Illegal function" error in record mode.)

Also, sometimes you will need to know how much space a given variant will take up. Of course you can calculate this if you know the way PASCAL allocates memory. But it is more elegant to let PASCAL do the calculation for you. RECSIZE(file) will return the number of bytes in a record for the file. All the variant and length options of GET and PUT can be used, so that the size of any version of the record can be calculated. E.g. RECSIZE(file,TRUE) returns the size of the TRUE variant.

In principle, GET, GETX, and other functions listed later, provide all the facilities needed to do arbitrary I/O with variable record sizes. However it is clear that they are not very convenient. The runtimes have been structured in such a way that it is easy to add what IBM would call "access methods". I.e. one could add a set of runtimes that know how to read and write EBCDIC tapes, and have them called by the normal GET and PUT. To do so one would simply add a procedure USEEBCDIC(file) that would change a dispatch vector so that all I/O for the specified file used the EBCDIC routines. No such extra methods are currently written, but we expect they will be eventually.

3.9 RENAME

RENAME(file,newname) may be used to rename an existing file. The newname argument is treated the same way as the name argument for RESET, etc. It specifies the new name for the file. If the operation works, EOF is set to false, otherwise to true. Of course unless you have set user error handling for this file, you will get a fatal error message if it fails.

This procedure does RNAMF. The original file should at least have its jfn in existence. Usually you will have done RESET, etc., on it.
It may be closed or not. (RENAME closes it if necessary.) As a side effect, the jfn of the original file is released, and the current jfn of the file is set to one associated with the new name.

RENAME to a blank name will not delete a file (as it did in Tops-10). Use DELETE instead.

3.10 DELETE

DELETE(file) will delete the mentioned file. It need not be open, but must have a jfn associated with it. (This will be the case if it was mentioned in the PROGRAM statement or if RESET, REWRITE, etc., were done on it.) It need not be closed, as DELETE will close it first if necessary.

3.11 UPDATE

You may open a file for updating by using the runtime UPDATE. When a file has been opened for UPDATE, both GET and PUT (or READ and WRITE) may be used with it. A single position is maintained within the file, which is advanced by both GET and PUT. There is also a special routine, PUTX, which will rewrite the last record read by GET. It is exactly equivalent to repositioning the file to the place it was at the beginning of the last GET and doing a PUT.

UPDATE has the same arguments as RESET. However, the third argument is ignored, as files are always opened by UPDATE in "interactive" mode.

When you have opened a file with UPDATE, you can then use the procedure PUTX, in conjunction with GET. To update a record, you would do GET to read it, change the contents of the PASCAL buffer variable corresponding to the file involved (i.e. FILE^), and then do PUTX(FILE). Note that PUTX takes only one argument, the file. It always rewrites exactly the same record as was read by the last GET (or the whole record read by a combination of GET and GETX -- See Section 3.8.6).

Note that EOF is normally false for an update file. It is set to true if an error occurs (and the user has specified user error handling) or you try to read beyond the end of file. It is perfectly legitimate to write beyond the end of file, however. Doing so extends the size of the file to include what is written.
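The GET, modify, PUTX cycle described above might look like the following sketch, which adds 1 to every integer in a disk file. The program name BUMP and file name COUNTS are hypothetical. The sketch also assumes that, because UPDATE opens the file in "interactive" mode, no record has been read when UPDATE returns, so the first GET is explicit.

```pascal
PROGRAM BUMP(COUNTS);
(* Hypothetical sketch: add 1 to every integer in a disk file.  *)
(* Assumption: after UPDATE no record has been read yet, so the *)
(* first GET is explicit.                                       *)
VAR COUNTS:FILE OF INTEGER;
BEGIN
  UPDATE(COUNTS);
  GET(COUNTS);
  WHILE NOT EOF(COUNTS) DO
    BEGIN
      COUNTS^ := COUNTS^ + 1;   (* change the buffer variable *)
      PUTX(COUNTS);             (* rewrite the record just read *)
      GET(COUNTS)               (* advance to the next record *)
    END
END.
```

The loop ends when a GET tries to read beyond the end of file, since that is what sets EOF for an update file.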
3.12 Random access

It is possible to move around a file randomly when it is on a direct-access device (i.e. disk or some equivalent). The procedures that implement this use a byte serial number to keep track of their position. I will henceforth call this the "position". This number refers to the number of bytes between the beginning of the file and the record in question, so the first record begins at position 0. The position is absolute within the file, so if one or more pages are missing from the file, gaps are left in the position numbers. Note that the unit of measure is the byte. This corresponds to one word in the PASCAL buffer. However in the file itself the bytes may be packed, depending upon the I/O mode in which it was opened. TEXT files (FILE OF CHAR) are stored 5 bytes per word by default. Other files are one byte per word by default.

CURPOS(FILE) returns the current position index of the file. This is the position at which the next record to be read or written would begin. When a file has just been opened, the position is, of course, 0. Note that CURPOS may give an error if used with a file not on disk. CURPOS is a builtin function.

SETPOS(F,B) sets things up so the next GET(F) or PUT(F) will get or put the record that starts at position B. If the file is an input file, SETPOS does an implied GET. To suppress this implied GET, use an extra non-zero argument, e.g. SETPOS(F,6,TRUE). SETPOS is also a builtin procedure.

If you SETPOS to a position beyond the end of file, the next read will set EOF. If you really wanted to read, you should SETPOS to a more reasonable position. This will reset EOF. It is perfectly legal to write beyond the end of file, however. Doing so extends the length of the file. If you SETPOS to a position beyond the end of the file, it is possible to leave non-existent pages between the old end of file and the newly written part.
These should cause no trouble to PASCAL (such pages are treated as full of binary zeroes), but may for programs written in other languages.

3.13 APPEND

Occasionally one wishes to append new data onto the end of an existing file. The monitor has facilities for doing this without recopying the existing data. Proper use of these facilities also allows one to append data to an append-only file. The procedure APPEND implements this facility in PASCAL. It has exactly the same parameters as REWRITE. The difference between it and REWRITE is that the file mentioned must already exist and writing begins at the end of the existing data.

APPEND is simulated for pmap'ed disk files by setting the file to the end of file pointer. For other I/O modes the openf has the append bit set and the monitor is assumed to do something appropriate.

3.14 Wildcards

If you wish to handle file names with wildcards in them, you should specify :* after the file name in either the program statement or the RESET, etc., call. This simply allows wildcards to be accepted by the file name parser. To actually implement the wildcards, you must have a loop in your program. Each time through the loop the file should be opened with the file name field defaulted in the RESET, etc. This causes the existing jfn to be used. To advance the jfn to the next file, the predeclared function NEXTFILE(file) should be used. It will return 0 if no more files remain in the group described by the file spec. Otherwise it returns the bits returned by the GNJFN jsys. Thus a typical program might look as follows:

   PROGRAM WILD(INFILE:*-);
   VAR INFILE:TEXT;
   BEGIN
   REPEAT
     RESET(INFILE);
     ...
   UNTIL NEXTFILE(INFILE) = 0
   END.

Note that when NEXTFILE returns 0 it releases the jfn. This is done automatically by the monitor, and cannot be prevented by Pascal.
3.15 Including external text

Users who write complex multi-file programs sometimes want to put TYPE, CONST, and EXTERN declarations in a common file used by all programs. Such a file is used by putting the INCLUDE statement at the beginning of any block in the program. The syntax is

   INCLUDE 'file spec';

The quotes are required. This statement may occur anyplace the token TYPE would be valid. The text in the file is included in the source of the program, replacing the INCLUDE statement. The syntax analyzer is then reset so that a TYPE statement would still be legal following the INCLUDE. More than one file may be included by

   INCLUDE 'file spec','file spec', ...;

or by more than one INCLUDE statement.

An included file consists of possible CONST, TYPE, and EXTERN declarations, with exactly the same syntax they would have at the beginning of a program or block. The file should terminate in a period. But if the thing just before the period is an integer, you will need a space or new line before the period so the scanner doesn't think it is part of the number.

3.16 Miscellaneous I/O Functions

The following are provided for completeness. They will not be useful for most people. Those that are not explained below usually just do a monitor call with the same name. See the Monitor Calls manual for such cases. They must be declared external, as shown below, but they are built into the PASCAL library. Note that the symbol FILE is legal in the declaration of a procedure, as shown below. It will match a file of any type. This sort of declaration, which cancels some of the normal type checking, should be used with great care. Those functions and procedures that do not require EXTERN declarations are listed below under Standard Functions and Procedures.

PROCEDURE QUIT; EXTERN;

   Causes a normal program exit immediately. Closes all files.

FUNCTION TO6(A:ALFA):INTEGER; EXTERN;

   Converts 6 characters in A to sixbit.
PROCEDURE FROM6(I:INTEGER;VAR A:ALFA); EXTERN;

   Converts from sixbit to ascii.

PROCEDURE CLREOF(VAR F:FILE); EXTERN;

   Clears PASCAL's EOF indicator. You must do this if you want to proceed after an end of file or some error for which you are enabled. Sets EOF to false for input, true for output. Clears the error indication used by ERSTAT, so the next ERSTAT will return 0. Also does STSTS to clear the EOF and error flags in the monitor, assuming that there is a physical file open. Not necessary for clearing end of file in random access files, as moving the file pointer will clear EOF.

FUNCTION ERSTAT(VAR F:FILE):INTEGER; EXTERN;

   If the user specified that he wanted I/O to continue in spite of errors, this function must be used to check for whether an error occurred. It will return 0 if no error has happened, otherwise the error bits from a GETER jsys. Only the most recent error code generated is returned.

PROCEDURE ANALYS(VAR F:FILE); EXTERN;

   If an error occurred with the file, prints an official-looking error message. No effect if no error occurred, or if the file is connected to a string with STRSET or STRWRITE.

FUNCTION LSTREC(VAR F:FILE):INTEGER; EXTERN;

   Returns the size of the last record read or written. This is useful mostly in the case of reading records in "record mode" (mode 7), as there is no other way to find out how big the record was that you just read. It is implemented in all modes for record I/O, however. It might also be useful in case an error occurs when reading a record and user error recovery is enabled. It will return the number of bytes actually read, according to the monitor. Experience shows that in this case the number is often garbage, however.

FUNCTION CURJFN(VAR F:FILE):INTEGER; EXTERN;

   Returns the indexable file handle for the file (0 if none). This is a full-word quantity with the JFN in the right half, and various bits in the left half.

PROCEDURE SETJFN(VAR F:FILE;I:INTEGER); EXTERN;

   Sets an indexable file handle for the file.
   First does RCLOSE if open. Note that you may put bits in the left half if desired, though one would think that normally a simple JFN would be used. If the jfn came from another file by CURJFN, that file should be CLOSE'd first, in order to release any I/O buffers. (A fatal error may occur if this is ignored.)

PROCEDURE GETPAGES(N:INTEGER;VAR P:INTEGER;VAR PP:PAGEPTR); EXTERN;

   Gets a block of memory:
     N is the number of pages to get, 1 - 36.
     P is set to the page number, 0 - 777B.
     PP is set to a pointer to the page
       [pageptr = ^mempage; mempage = array[0:777B] of integer]
   Always use this if you need to allocate memory. There are problems in trying to do it yourself. The memory comes from the I/O buffer area.

PROCEDURE RELPAGES(N:INTEGER;P:INTEGER); EXTERN;

   Returns a block of memory to the runtimes:
     N is the number of pages in the block, 1 - 36.
     P is a page number, 0 - 777B.
   The page should have been gotten from GETPAGES.

The procedures NEWPAGE and RETPAGE are still in the runtime library at the moment, but should not be used by new programs. They are exactly like GETPAGES and RELPAGES, except that they do not have the argument N. They get and return a single page.

See sections 5.4 and 5.5 for procedures relating to interrupts and synchronization. (They are also in EXTERN.PAS.)

Beware that programs using these things are of course not transportable to machines other than the DEC20!!

3.17 The structure of a PASCAL program

Some users will need to know the exact structure of a PASCAL program in memory. PASCAL produces two-segment programs, which are potentially sharable. Thus at the start of the program, the high segment contains all of the code and certain constants, and the low segment contains global data. There are three other data areas which are created during execution of the program: the stack, the heap, and I/O buffers.

I/O buffers are created mainly for disk files.
Since the main use is as pages for the PMAP jsys, the buffer area is always allocated in page units. The buffer area begins immediately above the end of the low segment data area. .jbff always points to the beginning of the next page that will be allocated. A bit map is maintained to indicate free pages. If you ^C and restart the program, all pages from the end of the low segment data to .jbff are deallocated, to guarantee that all PMAP'ed disk files can be closed by the initial RESET.

The heap contains all space allocated by the NEW procedure. It is located immediately below address 400000B, and expands downwards.

The stack contains parameters, return addresses for routine calls, and all local variables for procedures. The stack is allocated in pages located ABOVE the high segment.

The entry code compiled into every PASCAL main program does some things that you may find useful to know about. First, it has PORTAL instructions at the starting address and at the next address. This means that a PASCAL program may be made execute-only, and be started either normally or with a run offset of 1. Second, there is a global variable %CCLSW, which is set to 0 if the program is entered normally, and to 1 if it is entered with a run offset of 1.

It should be possible to control-C a PASCAL program and restart it. However, the user should be aware that global variables will not be reinitialized in this case. (In particular INITPROCEDURE's will NOT be done again, as they are not really executable code at all.)

4. PASCAL Debug System (PASDDT)

A PASCAL program may be run with the PASCAL Debug System by using the monitor command DEBUG instead of EXECUTE. (Successful debugging also requires that the program be compiled with /DEBUG, but this is on by default.) The system can be used to set breakpoints at specified line numbers.
When a breakpoint is encountered, program execution is suspended and variables (using normal PASCAL notation) may be examined and new values assigned to them. Also additional breakpoints may be set or breakpoints may be cleared. It is helpful to have a listing of the program available as the system is line number oriented.

The previous paragraph assumes that your EXEC has been modified to handle DEBUG properly for PASCAL. If it has not, you must use EXEC SYS:PASDDT,. To modify the EXEC, you should either make it add sys:pasddt for the debug command, or put in /DEBUG:PASCAL and modify LINK to load (but not start) PASDDT in this case.

Should you need to run LINK explicitly, rather than using the monitor DEBUG command, PASDDT is included by loading the file SYS:PASDDT.

4.1 Commands

The commands described here can be given when the system enters a breakpoint. When the program is executed an initial breakpoint will be entered before the main program begins. This will be shown by a message

   > STOP AT MAIN BEGIN
   >>

Additional breakpoints are set by the command STOP followed by a line specification, which is of the form line#/page#, or just line#, which is equivalent to line# on the current page. An example is 120/3, line 120 on page 3. A maximum of 20 breakpoints may be set.

PASDDT keeps track of the "current line". The current line is the one most recently printed out. (In the case of printouts showing a range, it is the first one in the printout.) This line can be referred to by a simple star (*). Hence "STOP *" will be equivalent to "STOP 3/5" if line 3 on page 5 is the current line. If you type a line number with no page number, the current page is supplied. So if the current line is 3/5, then "STOP 100" refers to 100/5. To find out what the current line is, type

   * =

A breakpoint is cleared by the command STOP NOT followed by the line specification. The command

   STOP NOT ALL

will clear all of them. The breakpoints set may be listed by

   STOP LIST

Variables may be examined by typing the variable followed by =. The variable may be any variable as given by the PASCAL definition (except files).
In particular it may be just a component of a structured type, or the whole structure itself. In the case of arrays, adjacent elements that are identical are displayed in a compressed form.

A new value may be assigned to a variable by typing the variable, then :=, then the new value. The assignment follows the usual type rules of PASCAL.

PASDDT has access to your source file (assuming that it is still there when you get around to running the program). Whenever you reach a breakpoint, the portion of your source file around the breakpoint will be displayed. Often it is useful to look at other parts of your program, in order to decide where to place breakpoints, or for other reasons. To look at your program, there are two commands:

   TYPE []
   FIND [] []

TYPE allows you to type a line or range of lines in the currently open file. (Use OPEN to change which file you are talking about, as described below.)

FIND allows you to search for any text string in the current open program. E.g.

   >> find 'foo'

will look for the next appearance of foo in your file. To find the second appearance of foo, use

   >> find 2 'foo'

Note that the FIND search starts at the line after the current line (.).

PASDDT can be used to follow execution of your program line by line. This is called "single stepping". Once you start this mode of execution, each time you hit the carriage return key, one line of your program will be executed. The commands relevant to single-stepping are:

   STEP

STEP causes the next line of your program to be executed. Since you often want to do this for many lines, it is rather inconvenient to type the word "STEP" for each line. Thus once you have done one step command, PASDDT enters a special mode where a simple carriage return will cause the next line to be executed.

   (carriage return) - do one line in single-step mode

This mode is announced by changing the prompt from the usual ">>" to "S>". Note that all the normal PASDDT commands are available as usual. The main difference that S> mode makes is that a carriage return is available as an abbreviation for STEP.
You get out of single step mode by doing a normal END, i.e. by proceeding your program in the normal way.

One other command is available in single step mode:

   - continue until end of procedure

When you are single-stepping and come to a procedure call, the single-stepper will show you everything that goes on within the procedure. Sometimes you really don't want to see the inner workings of the procedure. You just want to continue watching the program from the point where the procedure returns. An (sometimes labelled ) in single-step mode will cause the stepper to finish the current procedure silently. The next time you hear from the debugger will be when the procedure exits (unless of course you have placed a breakpoint within the procedure).

We advise all users to experiment with the STEP command, since single-stepping is the single most effective debugging tool we know.

The current active call sequence of procedures and functions is obtained by

   TRACE

The names of the procedures and functions together with their line numbers are printed in reverse order of their activation. TRACE may optionally be followed by a number, which will be the number of levels for which information is printed.

You can display the values of all variables currently active in the program by using the command

   STACKDUMP

This will give the same information as TRACE, and additionally at each level display the names and values of all local variables. As with TRACE, you may follow STACKDUMP by a number, and only that many levels will be displayed. You may also follow it with a filename in quotes. The information will be put in that file, instead of dumped to your terminal.

Program execution is continued by the command

   END

The program will run until another breakpoint is encountered. The breakpoint is announced by

   > STOP AT
   >>

Should you have more than one module (presumably because you have loaded both a main program and a file of external procedures), special care is required.
At any given moment only one module is accessible to the user. That means that attempts to refer to variables or line numbers in another module will meet with errors. To change modules use the command OPEN followed by the module name. The module name is the name in the program statement for the corresponding file. If no program statement occurs, it is the name of the .REL file. Whenever PASDDT is entered, it will tell you the name of the module that is open initially. In the case of a break, the module in which the broken line occurs is opened.

Sometimes there will be variables of the same name at several levels. In this case you may find it impossible to refer to a variable at a higher lexical level than the one where the break occurs. The command OPEN will set the context in which names are interpreted to any depth desired. The depth you type is the name as shown on TRACE or STACKDUMP.

If you want to stop debugging, the command

   QUIT

is sometimes useful. It is somewhat cleaner than control C-ing out of the debugger, as it closes all files and does other normal cleanup. Note that if you QUIT, partially written files are closed, and thus made to exist. Control C will not make such files visible.

You can control the verbosity of PASDDT with the command SHOW followed by a number. This controls the number of source lines shown when you enter a break. 0 is legal if you don't want to see any.

4.2 Asynchronous Interrupt

If a program goes into an infinite loop it may be stopped temporarily by typing ^D (possibly twice). This will enter the PASCAL Debug System. This interrupt is announced with the message

   > STOP BY DDT COMMAND
   > STOP IN :
   >>

If you happened to stop the program when it is in the runtimes, you will get an invalid result, probably line 0. However the other functions of PASDDT should still work in this case. In particular TRACE will tell you what procedure you are in. The END command will resume your program.
The only case I can think of where this does not work is if you are doing your own interrupt handling and get stuck in a priority 1 interrupt handler, or somehow disable the ^D interrupt (e.g. by supplying your own interrupt handler for channel 35). In this case you can halt the program by typing ^C^C, and enter PASDDT with the DDT command. However in order for everything to work properly PASDDT has to be told where you are in the program. So after typing ^C^C, type ^T and look at the PC it gives you. Deposit the PC in location 130, and then type the DDT command. Everything should now work right. However the END command may not properly resume your program if it was in the middle of doing a JSYS when you interrupted it.

4.3 Standard Procedures and Functions

The following standard procedures and functions (described in the Revised PASCAL Report) are implemented.

   Standard Functions    Standard Procedures

   ABS                   GET (See 2.6 and 3.8.6)
   SQR                   PUT (See 3.8.6)
   ODD                   RESET (See 2.3 and 3.8.1)
   SUCC                  REWRITE (See 2.3 and 3.8.1)
   PRED                  NEW
   ORD                   READ
   CHR                   READLN (See 2.7)
   TRUNC                 WRITE (See 2.4)
   ROUND                 WRITELN (See 2.4)
   EOLN                  PAGE
   EOF                   PACK
   SIN                   UNPACK
   COS
   EXP
   LN
   SQRT
   ARCTAN

Additional mathematical functions are available:

   ARCSIN                SIND
   ARCCOS                COSD
   SINH                  LOG
   COSH
   TANH

The following functions may be used to simulate the missing ** operator. They must be declared EXTERN.

   FUNCTION POWER(X,Y:REAL):REAL; EXTERN;            - X ** Y
   FUNCTION IPOWER(I,J:INTEGER):INTEGER; EXTERN;     - I ** J
   FUNCTION MPOWER(X:REAL;I:INTEGER):REAL; EXTERN;   - X ** I

Additional standard functions:

CURPOS(file) returns the current position in a file. See section 3.12. Only valid for files on random access device. (type integer)

DATE result is a PACKED ARRAY [1..9] OF CHAR. The date is returned in the form 'DD-Mmm-YY'.

NEXTFILE(file). Advances to the next spec for a wildcard file. See section 3.14.

RANDOM(ignored). Argument is an integer, which is ignored. Result is a real number in the interval 0.0 ..
1.0

RECSIZE(file) returns the record size of the file. One may also specify a particular variant whose length is to be returned. See section 3.8.6 for details. (type integer)

RUNTIME elapsed CPU time in msec (type integer)

TIME current time in msec (type integer)

Additional standard procedures:

APPEND(file,name,...). Like REWRITE, but extends an existing file. See section 3.13.

BREAK(file). Forces out the output buffer of a file. Should be used before magtape positioning. It is not needed for terminals to force out messages. See section 3.8.4.

BREAKIN(file,noget). Clears the input buffer count. Must be used after magtape positioning for buffered input. If noget is omitted or zero (FALSE), a GET is done on the file after the buffer count is zeroed. May also trigger delayed errors. See section 3.8.4.

CLOSE(file,bits). Close file and release its channel. See section 3.6.

DELETE(file). Delete file. See 3.10.

DISMISS(file). Abort creation of a file. See section 3.10.

DISPOSE(pointer,variant,...). Return a record to the heap. See section 1.3. (Some editions of Jensen and Wirth include this as a standard procedure.)

GETINDEX(file,index). If file is open on a string (STRSET or STRWRITE), sets index to current index into the string. (See section 3.1.)

GETLINENR(file,lineno). Lineno must be a packed array of char. It is set to the last line number seen in the file. If no line numbers have been seen '-----' is returned. ' ' is returned for a page mark. If file is omitted, INPUT is assumed.

JSYS(jsysnumber, ...). Arbitrary monitor call. See section 3.2.

MARK(index). Save state of the heap. See 3.7.

PUTX(file). Rewrite record in update mode. See 3.12.

RCLOSE(file). Close, releasing jfn. See 3.6.

RELEASE(index). Restore saved state of the heap. See 3.6.

RENAME(file,name,...). Rename an open file. See 3.9.

SETPOS(file,position). Move in random access file. See 3.12.

STRSET(file,array,...). Open input file on array. See 3.1.

STRWRITE(file,array,...). Open output file on array.
See 3.1.

UPDATE(file,name,...). Open random access file for revising in place. See section 3.11.

Although it is not exactly a procedure or function, some explanation should be given of the MOD operator. X MOD Y is the remainder after dividing X by Y, using integer division. The sign of the result is the same as the sign of X (unless the result is zero, of course). Note that this is a different definition than the one used by mathematicians. For them X MOD Y is always between 0 and Y-1. Here it may be between -(Y-1) and +(Y-1), depending upon the sign of X. This implementation is used for consistency with the Cyber implementation, which is the semi-official standard. Note that SAIL (and some other widely used languages) also use this perverted definition of MOD.

4.4 External Procedures and Functions

A procedure or function heading may be followed by the word EXTERN. This indicates to the compiler that the routine will be supplied at load time. In addition it may be specified that the routine is a PASCAL, FORTRAN, ALGOL or COBOL routine. PASCAL is assumed if no language is specified. The language symbol determines how the parameters are passed to the external routine. The relocatable file also contains information to direct the loader to search the corresponding library on SYS:.

Example:

   PROCEDURE TAKE(VAR X,Y: INTEGER); EXTERN FORTRAN;

The PASCAL compiler can deal with two kinds of files: main programs and files of external procedures. A main program contains a single program with procedures local to it. There must be exactly one main program involved in any load (i.e. in one EXEC command). Any procedures not present in the main program must be declared EXTERN, as explained above. They must then be defined in a file of external procedures.

A file of external procedures has the following differences from a main program:

(1) There is no top level code. The period follows the last top level subroutine.
For example:

   var i,j:integer;
   procedure a;
     begin i:=1 end;
   procedure b;
     var y:integer;
     begin y:=1 end.

In a main program, there would be top level code after procedure B.

(2) The top level procedures, A and B in the above example (but not any procedures defined within A or B), have their names declared in a special way. This makes them accessible to other programs. Note that only the first six characters of the name are significant to other programs that access these procedures as EXTERN.

(3) A file of external procedures must either have a comment of the form (*$M-*) at the beginning, or be compiled /NOMAIN. (These both do the same thing.)

You may combine several .REL files containing external procedures to form a library. If you want to search this in library search mode, it will be necessary to specify entry points for each module. Each source file will compile into a single module. The module name will be the program name specified in the PROGRAM statement, if there is one, otherwise the file name. If you do nothing special, that module name will also be used as the only entry for the module. If there is no top level procedure with the same name, that name will be assigned as an alternate name for the first procedure in the file. To get more than one entry point, you must use a special form of the PROGRAM statement, e.g.

   PROGRAM TEST,A,B;

for the above example. This declares TEST as the module name, and A and B as the entry points. Usually you should list all of the top level procedures as entry points, although this is not required. Note that these entry points are only needed for library search mode. Even without this special PROGRAM statement the procedures A and B could be accessed as EXTERN procedures by a separate main program.

Note that the normal form of program statement, e.g. PROGRAM TEST (A, B);, is illegal for a file of external procedures.
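As a sketch of how a separate main program might use the file of external procedures shown in the example above, consider the following. The program name MAINPG is hypothetical, and the sketch assumes that the external file has been compiled with (*$M-*) or /NOMAIN as described in (3). Only the EXTERN declarations follow directly from the rules given in this section.

```pascal
(* Hypothetical main program. MAINPG is an assumed name, chosen   *)
(* so that it does not collide with the names A and B.            *)
PROGRAM MAINPG;

(* A and B are defined in the separately compiled file of         *)
(* external procedures; here they are only declared.              *)
PROCEDURE A; EXTERN;
PROCEDURE B; EXTERN;

BEGIN
  A;   (* runs procedure a from the external file *)
  B    (* runs procedure b from the external file *)
END.
```

Both files would then be named in the same EXEC command, since exactly one main program must be involved in any load.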
All files that are to be initialized at the beginning of the program must be declared in the program statement in the main program. The form that declares only the module name, e.g. PROGRAM TEST;, is legal, however.

It is possible for one file of external procedures to call procedures defined in another file of external procedures. As usual, they must be declared as EXTERN in each file where they are to be called.

Assume the files TEST.REL, AUX1.REL, and AUX2.REL are to be loaded, along with a routine from the library NEW:ALGLIB.REL. Execution is accomplished by:

   EXEC TEST,AUX1,AUX2,NEW:ALGLIB/LIB

Note that this command would cause any of the programs that had not been compiled to be compiled.

5. Miscellaneous

5.1 Implementation Restrictions

a) A maximum of 20 labels is permitted in any one procedure. (This is an assembly parameter. The whole restriction may be removed easily, at the cost of some extra local fixups in the .REL file.) This includes both labels declared in the LABEL section and those not so declared. (Labels need be declared only if they will be the object of a non-local goto.)

b) Printer control characters are not available. A new page is started by a call to the standard procedure PAGE.

c) Procedures and functions may be passed as parameters to procedures and functions, as described in the Revised Report. We have not modified the syntax to allow declaration of the arguments to such parametric procedures/functions. (Prof. Nagel's version contains such a modification.) Also, note that when a parametric procedure/function is called, all of the arguments passed to it must fit in the accumulators. Normally this allows 5 arguments, although certain arrays count as 2 arguments, and functions allow one extra argument. An appropriate error message is given if this constraint is violated.

d) [not applicable to Tops-20 implementation]

e) Only comparisons described in the PASCAL manual can be done.
There were serious problems with the earlier attempt to allow comparison of arbitrary records and arrays.

f) Sets may only be defined on types or subranges of types having 72 or fewer members. With subranges of integers the set may only include 0 to 71. With enumerated types it may include only the first 72 members of the enumeration. Special provisions are made to allow sets of CHAR. The problem is that there are 128 possible ASCII characters. This problem is "solved" by treating certain characters as equivalent. In particular, lower case letters are treated as equivalent to the corresponding upper case letter. And all control characters except for the tab are treated as equivalent. Thus ['a'] is exactly the same set as ['A']. (One of those is lower case and the other upper case.) Similarly 'a' in ['A'] will succeed. And ['^X'] is the same set as ['^B'].

g) WRITE(F,X,Y) is supposed to mean to write the value of F, X, and Y onto the file OUTPUT, except that if F is a file, X and Y should be written onto F. However the case WRITE(F^,X,Y) is hard to handle. It should cause F^, X, and Y to be written to output. (F^ is the contents of the buffer associated with file F.) But this case cannot be recognized by a LR1 scan. Thus in this compiler it is treated as an error.

h) WRITE(TTY,X,Y) actually writes to the file TTYOUTPUT. This mapping of TTY into TTYOUTPUT occurs at compile time. So if you pass the file TTY to a procedure as parameter F, WRITE(F,X,Y) is not transformed into WRITE(TTY,X,Y). It is not clear whether this is a bug or not.

i) This compiler attempts to check that assignments to subrange variables are within the subrange. It is possible to fool this test by using VAR parameters. These problems cannot be overcome unless there is some way for the compiler to tell which VAR parameters are intended as inputs to the procedure and which as outputs.

j) The contents of unused bits in packed arrays and records is undefined.
This should not cause trouble, except in programs that play fast and loose with variant records, or programs that pass arrays of type PACKED ARRAY OF CHAR to Fortran programs. Many Fortran programmers will use integer comparisons on character data, thus requiring the low order bit in the word to be zero. The code compiled in Pascal to compare PACKED ARRAY OF CHAR variables ignores the low order bit, so this does not cause a problem in Pascal. If you require unused fields to be zero in all cases, you can set /ZERO or (*$Z+*).

k) Only the first 10 characters of identifiers are examined, so all identifiers must be unique to their first 10 characters. Note that the Revised Report only requires the implementation to look at the first 8.

l) All of the entry points in the runtime library are effectively reserved external names. That is, if you use one of these names either as your program name (in the PROGRAM statement) or as the name of a procedure in a file of external procedures, disaster may result. Any name whose first six characters are the same as one of these names will cause the problem. You can sometimes get away with violating this rule if your program does not use any of the features of Pascal which cause the particular runtime routine involved to be invoked. As of the time when this document was prepared, the following were the entry point names.
For an up to date list, use the command "TTY:=PASLIB/POINTS" to MAKLIB:

   DEBUG, READC, READI, READPS, READR, READUS, WRITEC, WRTBOL, WRTHEX, WRTINT, WRTOCT, WRTPST, WRTREA, WRTUST, ANALYS, APPEND, BREAK, BREAKI, CLOFIL, CLREOF, CORERR, CURPOS, DELF., END, ENTERC, ERSTAT, GETCH, GETCHR, GETFN., GETLN, GETLNX, GETX., ILLFN, INXERR, LEAVEC, LSTNEW, LSTREC, NEW, NEWBND, NEWCL., NEWPAG, NEXTFI, NORCHT, PASIN., PTRER., PUT, PUTCH, PUTLN, PUTLNX, PUTPG, PUTPGX, PUTX, QUIT, RELF., RENAME, RESDEV, RESETF, RETPAG, REWRIT, SETPOS, SRERR, UPDATE, SAFBEG, SAFEND, STSET., STWR., PSIDEF, PSIDIS, PSIENA, CURJFN, FROM6, SETJFN, TO6, DATE, .STCHM

m) The compiler does not enforce the restriction that a FOR loop body may not change the controlled variable. The following statement is illegal, but will compile under this compiler:

   FOR I := 1 TO N DO I := I+1

It will do every other value of I.

5.2 Use of DDT

It is possible to use regular DDT to debug a PASCAL program. To do so, use the monitor DEBUG command with the switch /DDT after the first file name. If you run LINK explicitly, type /DEBUG as the first command, as usual.

It is also possible to have both PASDDT and DDT in core at the same time. To do so, you should load the file SYS:PASDEB with your program, e.g. "EXEC SYS:PASDEB.REL,PROG.PAS". PASDEB has the appropriate garbage in it to load the right files in the right order. When loading is finished, DDT will be started. You may examine things and set breaks using DDT. If you decide you will want any breaks using PASDDT, you should then use the command "PASDEB$G" in DDT. This will set things up so when you start your program you will get the usual "Stop at main BEGIN". To start your program type "$G". By the way, be sure not to use the DEBUG command when loading PASDEB, as you will get two copies of DDT!

In DDT, you will find that there are a few symbols defined for you. The beginning of your main program is defined as a global symbol.
Each procedure has up to three symbols defined for it. Assume that your procedure is called NAME. Then we have

   NAME   the first part of the procedure proper. This is an appropriate place to put a DDT break point.

   NAME.  the first instruction of a sequence of code used to adjust the static display pointer. It is located before NAME. Most procedure calls are to NAME.+, rather than to NAME.

   NAME%  the first location of a block of byte pointers associated with this procedure. This is located before NAME.

5.3 Arithmetic errors

The PSI system is always turned on. Currently the only channels used are the two arithmetic overflow channels (6 and 7). These channels are enabled if and only if /ARITHCHECK is specified with the program. (By default /ARITHCHECK is set if /CHECK is, and the default for /CHECK is on.) Thus if /NOARITHCHECK is specified, the PSI system will be turned on, but no channels will be enabled. (It is turned on in this case for consistency, in case the user wants to use PSI.)

The effect of all of this is that when an arithmetic error (divide by zero, overflow, underflow) happens, an error handler will be called if /ARITHCHECK is in effect. This will print an error message, and call DDT or PASDDT if loaded. Otherwise it will then exit.

5.4 Interrupt handling

Users may declare a Pascal procedure as an interrupt handler. To do so, call PSIDEFINE(chan,level,proc). Chan is the interrupt channel, 0 to 35. Level is the interrupt priority level, 1 to 3. Proc is the procedure to call when the interrupt occurs. PSIDEFINE must be declared external:

   Procedure PSIDEFINE (chan, level: integer; procedure proc); EXTERN;

The procedure called can do anything any Pascal procedure could normally do, except that it should not refer to any variables outside itself except top-level globals (variables not in a procedure but defined at the beginning of the program or file of external procedures).
This restriction is not enforced, but violating it could result in disaster, and certainly will result in junk.

The procedure may have from 0 to 5 arguments. If they exist, they are as follows:

The first argument is set to the channel on which the interrupt occurred (in case the same procedure is associated with more than one channel).

The second one is set to the value of PC when the interrupt occurred. The right half is the address to which control will return, and the left half is status bits.

The third argument is a Boolean variable that is normally false. It is true if this interrupt really occurred in a critical section and was deferred. If it is set, the PC is not the PC where the interrupt really occurred, but rather is in the routine that reinitiates the deferred interrupts when you leave the critical section. (See below for an explanation of critical sections.)

The fourth argument is the old PC passed by reference. Only hackers should touch this. It allows you to change the location that the system will return to after the interrupt routine.

The fifth argument is the array where the AC's are stored, passed by reference. This would let you change the contents of the AC's, e.g. to change the result after an arithmetic exception. This is also for hackers only.

Most users will use at most two arguments. Other information may be obtained by appropriate jsys's.

If the user sets up an interrupt for channel 6 or 7 (arithmetic errors), the interrupt will be "censored" so that errors occurring in the runtimes will not be seen. The user interrupt will preempt the normal Pascal arithmetic error handler.

There are also procedures for enabling and disabling interrupt channels: psienable(chan) and psidisable(chan). Psienable must be done before interrupts can be received on the corresponding channel. They do the aic and dic jsys's respectively.
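The following is a minimal sketch of how the calls described in this section might be combined. It is not taken from the runtime sources. The program name PSIDEM, the handler name COUNTIT, the variable HITS, and the choice of channel 0 and level 3 are all assumptions made for illustration. Arranging for the monitor to generate interrupts on the chosen channel (by an appropriate jsys) is not shown.

```pascal
(* Sketch only: names, channel and level are illustrative assumptions. *)
PROGRAM PSIDEM;

CONST
  MYCHAN = 0;   (* assumed channel number, in the range 0 to 35 *)
  MYLEV  = 3;   (* interrupt priority level, in the range 1 to 3 *)

VAR
  HITS: INTEGER;   (* top-level global, so the handler may refer to it *)

PROCEDURE PSIDEFINE(CHAN, LEVEL: INTEGER; PROCEDURE PROC); EXTERN;
PROCEDURE PSIENABLE(CHAN: INTEGER); EXTERN;
PROCEDURE PSIDISABLE(CHAN: INTEGER); EXTERN;

(* Handler using the first two of the five possible arguments:     *)
(* the channel and the PC at the time of the interrupt.            *)
PROCEDURE COUNTIT(CH, PC: INTEGER);
BEGIN
  HITS := HITS + 1
END;

BEGIN
  HITS := 0;
  PSIDEFINE(MYCHAN, MYLEV, COUNTIT);   (* associate handler with channel *)
  PSIENABLE(MYCHAN);                   (* must precede any interrupts *)
  (* ... main computation would go here ... *)
  PSIDISABLE(MYCHAN);
  WRITELN(TTY, 'Interrupts seen: ', HITS)
END.
```

The handler touches only a top-level global, which keeps it within the restriction on variable references stated above.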
Note that the normal Pascal initialization sequence enables the interrupt system as a whole and sets up the vector, so only the individual channels need to be enabled. Psienable and Psidisable must be declared as EXTERN.

The Pascal runtime system is designed to be as reentrant as possible. This is why you can do almost anything at interrupt level without worrying about what was going on when the interrupt occurred. The known exceptions are

- You should generally not do I/O on the same file at interrupt level and in the main program, unless you know that you will never interrupt an I/O operation on that common file. The one exception is the predeclared file TTY, which we take great pains to handle in a reentrant manner. (That is, you can do I/O with TTY in both the main program and at interrupt level.)

- You should be careful about referring to common (global) data.

- If you are using DISPOSE, you should be careful about NEW and DISPOSE. The memory manager used by NEW and DISPOSE is not reentrant. So if you happen to interrupt a DISPOSE and do another DISPOSE at interrupt level, the heap is likely to become corrupted. Any of the following things will eliminate this danger:

   - If you are not using DISPOSE. In this case you get a simpler memory manager for NEW, and it is reentrant.

   - If you know that an interrupt will never occur during NEW or DISPOSE.

   - If you do not use NEW or DISPOSE at interrupt level.

  If none of these things is true, then you are going to have to put the NEW and DISPOSE that occur in the main program inside critical sections.

5.5 SYNCHRONIZATION (critical sections)

Occasionally it is necessary to manipulate the same global data in both the main program and an interrupt routine. When this is done, it is often necessary to be sure that the interrupt routine does not break into the middle of the main program's manipulation. To attack this problem, we have added two synchronization primitives.
These allow you to declare any section of your program to be a critical section. Within a critical section any interrupts that occur are deferred until you leave the critical section. To begin a critical section, call ENTERCRIT. To end a critical section, call LEAVECRIT. These are procedures with no parameters, and should be declared EXTERN.

ENTERCRIT is vaguely equivalent to turning off the interrupt system, except that it is much faster than doing that jsys. When you are in a critical section, all interrupts are immediately dismissed, but the runtimes keep track of which channels have had interrupts occur on them. When you do LEAVECRIT, the deferred interrupts are simulated by the IIC jsys, higher priority level channels being done first. If several interrupts occur on the same channel during the critical section, only one interrupt will be generated at the end. This is consistent with DEC's design decision that programs can never assume that there are the same number of interrupts as causes of interrupts. Your code must check all possible causes of the interrupt, and continue to process them as long as any show up.

Note that you probably cannot defer interrupts on the "panic channels". If one occurs during a critical section, you may get the usual error message from the EXEC.

Note that memory allocation and deallocation in the runtimes are done within critical sections. This is what allows you to open and close files in interrupt routines.

5.6 Warning about error messages

If you handle your own errors, you should not assume that the error code you get was necessarily generated by the monitor. Occasionally the runtimes detect some problem and generate what seems to be an appropriate code. At least the messages seem appropriate for the circumstances.

5.7 Interfacing to external procedures in MACRO

This section discusses the structure of MACRO routines designed to be called from a PASCAL program.
Such routines will require a declaration within the PASCAL program, with EXTERN used in place of the body. EXTERN causes the compiler to expect a routine that uses the PASCAL calling conventions, so those will be discussed here. Should you prefer to use the Fortran-10 calling conventions, the routine should be declared EXTERN FORTRAN.

The calling conventions are similar for both procedures and functions. The only difference is that functions return values, and procedures don't. In both cases, the arguments are put in accumulators 2 through 6. There is a way to pass more parameters than will fit in these accumulators, but it is fairly complex to explain. Should you need to do this, you are probably best to look at the code produced by the compiler (using /OBJECT). What is put in the accumulators is determined as follows:

   by value, one word - the value in one accumulator
   by value, two words - the value in two successive accumulators
   by value, more than two words - address of object in one accumulator
   by reference (VAR) - address of object in one accumulator

Your routine may use the accumulators freely, except for 15, 16, and 17.

   15 - this is a holdover from the tops-10 version. You should probably make sure that it is not changed.

   16 - pointer to the base of the local variable area. This is in the stack below the current value of 17. All local variables of the calling routine may be accessed as positive offsets off 16. To find the offsets you will have to look at the object code, however. This value should be unchanged on exit from your routine.

   17 - pointer to the top of the stack. You may use it in pushj and push. Be sure that every push is matched by a pop, and pushj by popj. Note that the stack is at the top of allocated core, so it can expand almost indefinitely.

If your routine is to be called as a function, it should move the result to 1(p). [That's right, folks, one above the top of stack.]

You may call any PASCAL runtime routine with a simple pushj 17,.
You may call any normal PASCAL-compiled routine with a pushj 17, but you should push a dummy argument on the stack first, as PASCAL routines garbage -1(17).

5.8 Special linkage conventions for hackers

The following three identifiers function syntactically as if they were predeclared types. However they are only legal when used to describe parameters of EXTERN procedures. Thus they are a convenience for those brave souls who are trying to add more runtimes but do not want to have to modify the compiler.

  FILE - a parameter declared as FILE will match a file of any type. This is necessary for procedures such as CLOSE, RENAME, etc., which one obviously wants to work for files of all types.

  STRING - a parameter declared as STRING will match a packed array of CHAR of any length. This is used for the file name argument in RESET, REWRITE, etc. It actually puts data into two registers. The first gets the address of the array. The second gets its length in characters. This type of parameter only works with Pascal procedures. You can't pass it to Fortran, Cobol, or Algol. No error message will be generated if you try, but the results are garbage.

  POINTER - a parameter declared as POINTER will match a pointer of any kind. It is used for procedures such as NEW, which must work for pointers to any kind of structure. It also puts data into two registers. The first gets the value of the pointer (or its address if VAR is used). The second gets the size (in words) of the structure that the pointer points to. This type of parameter only works with Pascal procedures. You can't pass it to Fortran, Cobol, or Algol. No error message will be generated if you try, but the results are garbage.

Use of these things is strongly discouraged except by Pascal maintainers, who are assumed to understand what is going on.

I. References

(1) N. Wirth. The Programming Language PASCAL (Revised Report). Bericht Nr. 5, Berichte der Fachgruppe Computer-Wissenschaften, ETH Zurich, November 1972.

(2) K. Jensen, N. Wirth. PASCAL - User Manual and Report. Springer Verlag, Berlin, Heidelberg, New York, 1974.

Table of Contents

1. How to use PASCAL-20
  1.1 How to use the normal compiler
  1.2 PAS: A Special Compiler for Unmodified EXEC's
  1.3 Core Allocation
  1.4 How to Write a Program (Lexical Issues)
2. Input/Output
  2.1 Standard Files
  2.2 File Declaration
  2.3 RESET and REWRITE (simple form)
  2.4 Formatted Output
  2.5 Reading characters
  2.6 The Standard Files
  2.7 Character Processing
3. Extensions to PASCAL
  3.1 Input/Output to strings
  3.2 Monitor calls
  3.3 INITPROCEDURE
  3.4 Extended CASE Statement
  3.5 LOOP Statement
  3.6 CLOSE, RCLOSE, and DISMISS
  3.7 MARK and RELEASE
  3.8 I/O facilities for wizards only
    3.8.1 Extra arguments to RESET, etc.
    3.8.2 Labelled tape processing
    3.8.3 I/O Error processing
    3.8.4 I/O implementation
      3.8.4.1 Byte mode [1]
      3.8.4.2 Pmap'ed mode [2]
      3.8.4.3 TTY mode [3]
      3.8.4.4 Null mode [4]
      3.8.4.5 Buffered mode, by word [5]
      3.8.4.6 Buffered mode, by bytes [6]
      3.8.4.7 Record mode [7]
      3.8.4.8 Other modes
    3.8.5 A note on byte sizes in files
    3.8.6 Variable Record Formats
  3.9 RENAME
  3.10 DELETE
  3.11 UPDATE
  3.12 Random access
  3.13 APPEND
  3.14 Wildcards
  3.15 Including external text
  3.16 Miscellaneous I/O Functions
  3.17 The structure of a PASCAL program
4. PASCAL Debug System (PASDDT)
  4.1 Commands
  4.2 Asynchronous Interrupt
  4.3 Standard Procedures and Functions
  4.4 External Procedures and Functions
5. Miscellaneous
  5.1 Implementation Restrictions
  5.2 Use of DDT
  5.3 Arithmetic errors
  5.4 Interrupt handling
  5.5 SYNCHRONIZATION (critical sections)
  5.6 Warning about error messages
  5.7 Interfacing to external procedures in MACRO
  5.8 Special linkage conventions for hackers
I. References