THE PERFORMANCE LIBRARY

1.0  TIMING AND PERFORMANCE EVALUATION SUBROUTINES

During initial debugging and testing of a program it is often desirable to be able to trace the program path. Simply knowing how many times each subroutine executes can provide valuable information about the logical decisions being made within the program. A complete trace of program execution provides great detail about its inner workings.

Once a program is working we can give serious thought to tuning it; that is, to making it faster. This leads to the question: "Where does this program spend all its time?" According to programming proverb, most programs spend 95% of their time in 5% of their code, or figures to that effect. A series of measurements by Knuth ("An Empirical Study of FORTRAN Programs", Software-Practice and Experience, April, 1971) tends to confirm the gist of this proverb. We want to find out which small group of routines is taking most of the time, for they are the ones that should be tuned.

There are several subroutines in the PERFORMANCE library that permit us to trace program execution, find out what percentage of execution time is spent in each subroutine, and determine how many times each subroutine is called.

By inserting the statement

     CALL TIMER

at the beginning of the program, the timing package is loaded and activated. The executing program is interrupted at every tick of the system clock, and the timing package uses the FORTRAN traceback chain to see which subroutine is currently executing. When execution ends, the timing package prints a list of all routines that it saw and the number of times that each was seen, expressed as a percentage of the total number of samples taken. If the program was executing long enough on typical data, these percentages provide a good indication of where the program spends its time.
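
As a minimal sketch of the usage just described, a small test program might look like the following. Only the CALL TIMER statement comes from this library. The subroutine WORK, the loop counts, and the use of logical unit 5 for terminal output are assumptions made for illustration and should be adapted to the program being studied.

```fortran
C     Illustrative sketch only.  WORK, the loop counts and unit 5
C     are hypothetical; CALL TIMER is the only library call used.
      REAL X
C     Start the timing package before any other work is done.
      CALL TIMER
      X = 0.0
C     Call the routine under study many times so that the run
C     lasts long enough to collect a useful number of samples.
      DO 10 I = 1, 20000
         CALL WORK(X)
   10 CONTINUE
      WRITE (5,20) X
   20 FORMAT (' RESULT = ', E14.6)
C     The timing report is produced when execution ends.
      STOP
      END
C
C     Hypothetical compute-bound routine to be sampled.
      SUBROUTINE WORK(X)
      REAL X
      DO 30 J = 1, 1000
         X = X + SQRT(FLOAT(J))
   30 CONTINUE
      RETURN
      END
```

Because nearly all of the computation in this sketch happens inside WORK, the report would be expected to attribute most of the samples to WORK rather than to the main program, provided the run is long enough.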
By inserting the statement

     CALL MEASUR

at the beginning of the program, the above information, as well as the number of times each subroutine is called, is reported.

Note:

     1)  At least TR:NAMES must be in effect for the performance package to work. This package uses the FORTRAN traceback chain.

     2)  For accurate timing the program must have higher priority than any other user program running on the system when the test is done. Samples are taken at every clock tick regardless of who has the CPU; if the task being studied is locked out then the timing result will not have much meaning.

     3)  The task must execute for a sufficiently long time, say 3 seconds (real time, not CPU time).

     4)  The performance package calls system subroutine USEREX.

     5)  Internally, TIMER uses subroutine TIMERX and PSECT (COMMON area) TIMERC. Internally, MEASUR uses subroutine MESURX and PSECT (COMMON area) MESURC.

By inserting

     CALL TRACE

a complete trace of subroutine and function calls is generated. Internally, TRACE uses the routines TRACEP, TRACES, TRACEM, and PSECT (COMMON area) TRACEC.

Here are the performance routines with descriptions.

1.1  TIMER

     SUBROUTINE TIMER(FNAME)

TIMER is called once at the beginning of execution to initialize and start the timing package. The argument FNAME is optional; if present, FNAME is an EOS-terminated string providing a file name. This file name is remembered, and when the program exits, the timing information is written into the file. If the FNAME argument is omitted, the timing information is printed; that is, LP: is the default for FNAME.

1.2  MEASUR

     SUBROUTINE MEASUR(FNAME)

The instructions for MEASUR are identical to those for TIMER. In addition, the number of times each subroutine is called is reported.

1.3  TIMEIN

     SUBROUTINE TIMEIN(RAD50('NAME'))

For simple timing applications, the TIMER or MEASUR subroutine is enough: TIMEIN is not used.
TIMEIN gives control over what parts of the program are timed, but it requires more modification of the source program than if only TIMER or MEASUR is used.

The call

     CALL TIMEIN(RAD50('NAME'))

causes subsequent clock ticks to be recorded under the name "NAME" for either TIMER or MEASUR. To go back to the normal recording of clock ticks, the program executes

     CALL TIMEIN

There are no arguments for this call.

When is this used? Say we wished to find out how much time our program spends in the low-level FORTRAN input routines. If our program is organized so that there is only one READ statement, that statement can be preceded and followed by TIMEIN calls like this:

     CALL TIMEIN(RAD50('$FORTI'))
     READ (LUNIN,10) INPUT
     CALL TIMEIN

Now all the samples taken while the program is executing under the FORTRAN input routine are recorded under the name "$FORTI". When timing information is listed, the percentage associated with $FORTI tells how much execution time the program spends in low-level FORTRAN read operations. Double buffering or perhaps MACRO input routines can be used to shrink this percentage.

1.4  TRACE

     SUBROUTINE TRACE(FNAME,IUNIT,NUMBER)

TRACE is called once at the beginning of execution to initialize and start tracing. FNAME is a legal file specification. IUNIT is the unit number on which to write the trace. NUMBER is the number of items to write per line.

If IUNIT is the user terminal, and NUMBER is 1, then the name of each routine is written to the terminal as it is called. Increasing NUMBER provides a more compact display.

A routine may be left out of the trace by compiling it with "/-TR".

2.0  BUILDING THE PERFORMANCE ROUTINES INTO YOUR TASK

The performance routines are found in the library [1,1]PERLIB.
By appending "[1,1]PERLIB/LB" to the list of files specified to the task builder, the necessary routines will be included in your task.

3.0  TECHNICAL DETAILS

These details are for those who wish to understand or modify these routines. The source and command files included with the distribution provide additional details.

To implement the subroutine call counting and the trace, a patch was made to the $NAM routine from the Fortran OTS. When $NAM is called to enter a new routine in the traceback chain, the patch makes a call to the GLOBL entry point $COUNT, which is found in MEASUR, TRACEM, and TIMER. (In the last, a RETURN is performed immediately.)

The routines TIMER and MEASUR are both generated from the source file PERFRM.MAC. The conditional assembly parameter, MEAS, determines which is generated.

TIMERX and MESURX are both originally generated from the Ratfor source file, PERFRM.RAT, as determined by the conditional parameter, MEAS. Fortran versions of these routines are provided for those who don't have Ratfor. The original Ratfor will be found to be much easier to read.

The routines TRACE, TRACES, and TRACEP are in the file TRACE.RAT. A Fortran file, TRACE.FTN, is also provided. The GLOBL entry, $COUNT, is found in TRACEM. Control is passed to TRACEP. The rest is all in Ratfor or Fortran.

You may feel free to write your own replacements. Your routine TRACEP will be called each time a new subroutine is entered, and will be handed the name of that subroutine. Do what you wish. A flag is set by TRACEM to prevent the code from being reentered until a RETURN from TRACEP is executed.

INDEX

     Building performance routines into your task      Section 2.0
     MEASUR                                            Section 1.2
     Technical details                                 Section 3.0
     TIMEIN                                            Section 1.3
     TIMER                                             Section 1.1
     Timing and performance evaluation subroutines     Section 1.0
     TRACE                                             Section 1.4