SUBREF.DOC                                                DG 14-MAY-84

David Gaudine
Concordia University, AD-532
Montreal, P.Q., Canada  H4B 1R6
(514) 482-0320 EXT. 485


SUBREF is a cross reference and overlay generation program for FORTRAN
programs.  Several types of cross reference maps may be obtained, as
well as 2 different overlay structures.  Under RT-11 no knowledge
whatsoever of overlays is required, since an efficient overlay
structure is produced in command file form.  Under RSX it is necessary
to type in the ODL file based on the SUBREF output, use a one-level
cotree for each region, and handle placement of common blocks and
library routines.  A commented sample run is shown below.


Limitations
-----------

1) Because SUBREF doesn't examine array declarations, it can't
   distinguish array references from function references.  When SUBREF
   sees one of these it assumes that it's a function reference, and
   looks for a function of the same name.  If there is no such function
   then the reference is ignored, since it is either an array reference
   or a reference to a function that is not in the input and hence will
   not be overlaid.  The bottom line is that if you have a subroutine
   or function "ABC" and an array "ABC" then all references to the
   array "ABC" could cause problems.  In some cases SUBREF may believe
   that recursion is taking place, as in the following case:

         FUNCTION ABC ()
         CALL DEF
         RETURN
         END

         SUBROUTINE DEF
         REAL ABC(5)
         A = ABC(1)
         RETURN
         END

   It appears (to SUBREF) that ABC and DEF call each other.  A suitable
   message may be printed.  In this case try both overlay structures.
   The standard overlay structure option will give a message indicating
   recursion involving one or more subroutines; the last one listed is
   usually the most relevant and corresponds to "ABC" or "DEF" above.
   The optimized overlay structure gives a similar message, usually
   naming only one routine.
   The "expanded calls map" option may also give useful output (but
   will take much longer than usual to be generated, since recursion
   causes 4500 lines of output to be computed).

2) Arrays are dimensioned for a maximum of 400 routines, which should
   be sufficient.

3) The arrays used for the "expanded calls map" are dimensioned for a
   4500 line maximum output.  This is often insufficient but can't be
   helped; besides, think of the paper you're wasting.  For small
   programs, reaching this limit implies recursion.


Input Files
-----------

The input for this example is contained in 3 files.  The main program
is in STEST.FOR, one subroutine is in A.FOR, and 2 more subroutines are
in OTHERS.FOR.

Input file names are typed in response to the appropriate prompt, one
per line, terminated by an empty line (extra carriage return).  If
there are several filenames then it may be more convenient to type them
into a file, one per line, and type the name of this file preceded by a
'$' in response to the prompt.  These two methods may be mixed freely.
If the '$' option is used then an empty line must still be typed
afterwards to terminate input.  The default extension is ".FOR" for a
source file or ".DIR" for a filename list file.  A filename list file
is typically created with a command such as

      DIR *.FOR/FAST/COL:1/OUT:PROG

and then editing the result (which also has the extension .DIR).

MACRO input files are not permitted.  If a MACRO file is encountered
then it is ignored and no harm is done.

In order to get the more efficient of the two overlay structures it is
necessary to generate one more input file.  This file must contain the
approximate relative sizes of the subprograms.  The sizes of the main
program and block data are not needed, although a warning is printed if
they are not provided.  The sizes can be obtained from a link map.  The
subroutine names and sizes are entered one (of each) per line in format
(A6,I5).  SUBREF will not read these from the map.
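As an illustration of the (A6,I5) layout, here is a minimal Python sketch that writes name and size pairs in that form: the name left-justified in columns 1-6 and the size right-justified in columns 7-11. The helper name `format_siz_line` is hypothetical and is not part of SUBREF, and the sizes are simply the ones used in the sample run. Right-justifying the size is a cautious choice, since some older FORTRAN systems treat trailing blanks in a numeric field as zeros.

```python
def format_siz_line(name: str, size: int) -> str:
    """Return one .SIZ line laid out as Fortran format (A6,I5)."""
    if not 1 <= len(name) <= 6:
        raise ValueError(f"routine name must be 1-6 characters: {name!r}")
    if not 0 <= size <= 99999:
        raise ValueError(f"size does not fit in an I5 field: {size}")
    # A6: name left-justified and blank-padded to 6 columns.
    # I5: size right-justified in the next 5 columns.
    return f"{name:<6}{size:>5}"


# Relative sizes taken from the sample run in this document.
sizes = {"STEST": 100, "A": 90, "B": 200, "C": 80}
lines = [format_siz_line(name, size) for name, size in sizes.items()]
print("\n".join(lines))
```

Each line produced this way is exactly 11 characters long, so it can be pasted into a .SIZ file without further column editing.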
The normal way of getting the values is to prepare a link command file
which places each routine in a separate segment (arbitrarily, they can
all be in region 1) and read the segment sizes (decimal words) from the
resulting link map.  Sometimes a SYSLIB or other library function or a
common block will appear in the root this way but will not be in the
root for the final overlay structure.  This is normally best ignored,
but sometimes (rarely) it's useful to check since the optimized overlay
structure may be affected.

I use link version 5.04F, which is not the latest version.  It was
supplied with RT-11 version 3.  Some (or all) later versions bomb out
if there are a very large number of segments.  The file BOMB.COM may be
used to check a version of the linker.  I haven't tried the linker
supplied with RT-11 version 5.  Unfortunately later versions of the
linker also left-justify the segment size, which complicates editing
the map to get the .SIZ file.

Here are the source, .DIR, and .SIZ files used for the sample run.
Note that the .SIZ filename is only prompted for when the "optimized
overlay structure" option is selected, and is not required otherwise.

      PROGRAM STEST                ! STEST.FOR
C
C     SUBREF test program
C
      CALL A
      CALL B
      STOP
      END

      SUBROUTINE A                 ! A.FOR
      CALL B
      CALL C
      RETURN
      END

      Subroutine B                 ! OTHERS.FOR
      RETURN
      End                          ! (Note lower case is OK)
      SUBROUTINE C
      RETURN
      END

A                                  ! STEST.DIR
OTHERS

STEST   100                        ! STEST.SIZ
A        90
b       200
C        80

Here is the terminal I/O for the sample run, with comments added
preceded by a "!".  Note that the default extension for output files
is .REF .

.RUN SUBREF
SUBREF version 5B

Enter filenames, one per line, blank line at end.
The default extension is .FOR .
Use $filename for a file containing a list of names.
In this case, the default extension is .DIR .

? STEST                 ! STEST.FOR
? $STEST                ! STEST.DIR
A                       ! echoes filenames in STEST.DIR
OTHERS
?                       ! no more input files
PASS 1                  ! looks for subprogram headers
STEST                   ! echoes subprogram names
A
B
C
THERE ARE 4 ROUTINES.
PASS 2                  !
                        looks for calls and function references
STEST
A
B
C
THE FOLLOWING ARE UNUSED        ! unused routines will not be overlaid
END OF UNUSED SUBROUTINES LIST
Do you want the dummy program output? Y         ! see below
File name? F1                                   ! F1.REF
Do you want the direct cross-reference listing? Y
Device or file? F2
Do you want the standard overlay structure? Y
Device or file? F3
RT-11 command file format? Y
Do you want the expanded calls map? (LONG!) Y
Device or file? F4
Do you want the indirect cross-reference map? Y
Device or file? F5
Do you want the co-residency map? Y
Device or file? F6
Do you want the optimized overlay structure? Y
Enter the .SIZ filename, or HELP: HELP  ! the only place that "HELP" works
This option requires the existance of a SIZE file.  This file must
contain one line for each subroutine, consisting of the subroutine
name and relative size in format A6, I5.  The only currently known way
to get the sizes is to do a link with every subroutine in a different
segment and read the sizes on the map.  The main program and block
datas (if any) do not need to have their sizes specified.  Normally
the names still appear, but this is not essential.
The default extension is .SIZ .
Enter the .SIZ filename, or HELP: STEST
Device or file? F7
RT-11 command file format? Y
.


Dummy Program Output
--------------------

This is the simplest possible program which gives the same SUBREF
output as does the original program.  Its primary use is to allow
repeating the SUBREF run without waiting while SUBREF reads all those
silly comments and other unrelated statements.  Also, if the "expanded
calls map" fails because the program is too big, then the dummy program
can be edited to remove unwanted calls.

      SUBROUTINE A
      CALL B
      CALL C
      END
      SUBROUTINE B
      END
      SUBROUTINE C
      END
      PROGRAM STEST
      CALL A
      CALL B
      END


Direct Cross Reference Map
--------------------------

This map answers the ancient question "what is ABC called by, and what
does it call?".
It does not tell you what is called by the routine which is called by
ABC.

ROUTINE   CALLED DIRECTLY BY
-------   ------------------
A         STEST
B         A  STEST
C         A
STEST     (NO REFERENCES)

ROUTINE   CALLS DIRECTLY
-------   --------------
A         B  C
B         (NO CALLS)
C         (NO CALLS)
STEST     A  B


Standard Overlay Structure
--------------------------

The following information is provided for interested parties and is not
needed in order to run the program.

This overlay structure follows all the normal rules, i.e. subprograms
in region 2 can call subprograms in the root or in region 3 but not in
region 1.  Subprogram size is not considered.  The message at the
beginning can be customized by editing RTLST.FOR .

Each subroutine is placed in a different segment, so to optimize for
speed it is necessary to combine segments that are likely to be swapped
frequently.  Generally it is best to combine small segments as well,
which results in a smaller SAV file, slightly less memory used, and
faster linking.

! Add any necessary libraries immediately before
! the "/I".  If the program will only be run on the LSI
! then add SY:FORLIB.FPU as the last library.
!
R LINK
STEST=STEST/I/C
A/C
B/O:1/C
C/O:1
$SHORT
^C


EXPANDED CALLS
--------------

The "expanded calls map" gives a graphical representation of what is
called by a given routine.  Its primary use is in optimizing a program
for speed.  The generated overlay structures yield programs which are
small but (usually) pitifully slow.  Looking at the sample map, it is
clear that if the speed of subroutine "A" is essential and if B and C
are in different segments in the same region then the segments
including B and C should be combined.  If B and C called anything else
then these other routines would also have to be combined.  Of course,
if A calls B once and C 100 times then overlaying B and C against each
other is fine, but if B and C are called alternately in a loop then the
segments should be combined.
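One way to picture how such a map comes about is as a depth-first walk of the direct-calls table. The Python sketch below is only an illustration under that assumption, not SUBREF's actual code: the function name `expanded_calls`, the `max_lines` parameter, and the printed layout are all hypothetical. The cap of 4500 lines mirrors the limit stated under Limitations, and the second call shows why apparent recursion fills the map up to the cap.

```python
def expanded_calls(calls, root, max_lines=4500):
    """Return (level, routine) pairs in depth-first order, capped at max_lines."""
    out = []
    stack = [(root, 1)]  # explicit stack, so deep call chains cannot overflow
    while stack and len(out) < max_lines:
        name, level = stack.pop()
        out.append((level, name))
        # Push callees in reverse so they are listed in their original order.
        for callee in reversed(calls.get(name, [])):
            stack.append((callee, level + 1))
    return out


# Direct calls of the sample program (from the direct cross reference map).
calls = {"STEST": ["A", "B"], "A": ["B", "C"], "B": [], "C": []}
for level, name in expanded_calls(calls, "STEST"):
    print(f"{'  ' * (level - 1)}{level}--> {name}")

# Two routines that appear to call each other never terminate on their own,
# so the listing runs until it hits the line limit.
looping = expanded_calls({"ABC": ["DEF"], "DEF": ["ABC"]}, "ABC", max_lines=10)
print(len(looping))
```

Removing a call from the `calls` table before the walk corresponds to editing the dummy program to get a partial map.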
If the map would have exceeded 4500 lines and was truncated, it is
possible to generate a partial map by editing the "dummy program"
output and removing unnecessary calls.  In this example the call to B
from STEST is unnecessary (when optimizing A) so the call to B can be
removed.  The calls to B and C from A must remain, and if B and C
included calls then these would have to remain as well.

LEVEL    ROUTINE
  1--> STEST
    2--> A
      3--> B
      3--> C
    2--> B


Indirect Cross Reference Map
----------------------------

The indirect cross reference map differs from the direct one in that
since STEST calls A and A calls C then STEST calls C.

ROUTINE   CALLED INDIRECTLY BY
-------   --------------------
A         STEST
B         A  STEST
C         A  STEST
STEST     (NO REFERENCES)

ROUTINE   CALLS INDIRECTLY
-------   ----------------
A         B  C
B         (NO CALLS)
C         (NO CALLS)
STEST     A  B  C


Coresidency Map
---------------

This map is a combination of the 2 indirect maps, showing what routines
may need to be in memory when any given routine is in memory.  This is
useful for debugging overlay structures, although if SUBREF is used to
generate an overlay structure then no debugging should be necessary.

ROUTINE   CORESIDENT WITH
-------   ---------------
A         B  C  STEST
B         A  STEST
C         A  STEST
STEST     A  B  C


Optimized Overlay Structure
---------------------------

The optimized overlay structure considers subroutine size, and totally
ignores the usual restriction that a subprogram in one overlay region
should not call any subprograms in a lower region.  The largest routine
present is placed in region 1 segment 1, and the next largest is placed
in region 1 segment 2 unless it calls or is called by the one in
region 1 segment 1.  This goes on down the list until there is nothing
more to put in region 1, and then the largest one remaining goes in
region 2 segment 1, and so on.
The size of a region is determined by the size of segment 1 of that
region, so if A is placed in segment 2 and A calls B then B may be
placed in segment 2 as well if the total size of segment 2 will still
be less than or equal to the size of segment 1.  In this case neither
A nor B may call, or be called from, any routine in segment 1.

As for the standard overlay structure, it is necessary to optimize for
speed and to minimize the total number of segments, which must be done
in that order.  SUBREF could minimize the number of segments
automatically but this would preclude optimizing for speed.  A future
version may accept an input file listing subroutines which the user
desires to be coresident, so SUBREF can perform the speed optimization.

Actually, 3 passes are made to try to find the best structure.  During
2 of these passes segments may be combined even if the total segment
size is slightly greater than the region size.

The "optimized overlay structure" option requires several other options
as intermediate results.  If these options are not selected by the user
then they will be computed automatically.  Therefore, for different
runs with the same input, the prompt for the .SIZ file name may appear
immediately after this option is selected or only after several
minutes, depending on which other options were chosen.

! Add any necessary libraries immediately before
! the "/I".  If the program will only be run on the LSI
! then add SY:FORLIB.FPU as the last library.
!
R LINK
STEST=STEST/I/C
A/C
B/O:1/C
C/O:1
$SHORT
^C
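The basic largest-first placement rule described in this section can be sketched in Python as follows. This is a simplified illustration, not SUBREF's implementation. The function name `greedy_regions` is hypothetical. The `coresident` table is assumed to come from the coresidency map with the main program left out, since it stays in the root. A routine is assumed to be barred from a region if it is coresident with any routine already placed there, not only the one in segment 1. The sketch gives every routine its own segment and omits the three passes and the combining of segments, so it is not expected to reproduce the sample command file, which appears to leave A in the root.

```python
def greedy_regions(sizes, coresident):
    """Assign routines to overlay regions, largest first.

    sizes:      {routine: relative size}
    coresident: {routine: set of routines that may need to be in memory with it}
    Returns a list of regions; each region is a list of routines, one per
    segment, with the largest (segment 1) first.
    """
    # Largest first; ties are broken by name so the result is repeatable.
    remaining = sorted(sizes, key=lambda r: (-sizes[r], r))
    regions = []
    while remaining:
        region, leftover = [], []
        for routine in remaining:
            # Segments of one region share memory, so a routine may join only
            # if it never has to be resident with a routine already placed.
            clash = any(
                routine in coresident.get(placed, set())
                or placed in coresident.get(routine, set())
                for placed in region
            )
            (leftover if clash else region).append(routine)
        regions.append(region)
        remaining = leftover  # whatever did not fit goes to the next region
    return regions


# Sizes and coresidency for the sample run, main program excluded.
sizes = {"A": 90, "B": 200, "C": 80}
coresident = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
print(greedy_regions(sizes, coresident))
```

Because the first routine examined in each round always fits into the empty region, every round places at least one routine and the loop is guaranteed to finish.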