.RIGHT MARGIN 80.LEFT MARGIN 5
.FIRST TITLE
.TITLE Sort-11 V3.0 performance
.SUBTITLE B.#Z.#Lederman
.PARAGRAPH
The following is some of the data I obtained when testing some variants
of Sort-11 V2.0 and V3.0. Statistics were obtained from Sort-11 itself,
through System Accounting, with SRMLOG (a system measurement program),
and with SPM-11.
.PARAGRAPH
The first test shown is sorting a file containing 20,000 records of 64
bytes (2579 blocks), each record being a string of random numbers (the
file is in "random" order), sorted with four keys (1.8:57.8:9.8:17.8).
The different lines of data are for different numbers of work files: the
first is with a large number of work files (5 for V2.0, 7 for V3.0), the
second is with the default number of work files, and the third (when
present) is for 3 work files. The three letters at the left are the task
name under which that version was installed; the same names appear in
other tables.
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 30
                                     Data from SPM                System
                          Elapsed Time  CPU   DB1  F11ACP Over- Accounting
    Version Type                  sec. sec.   QIO     QIO  lays Directives
--------------------------------------------------------------------------
.BLANK
srt V2.0 as distributed            228  106  3149      31    88       9579
                                   299  132  3687      89   199      11242
.BLANK
srr V2.0 non-overlayed             237  116  3061      32             3266
         resident library          306  141  3464     114             3954
.BLANK
srs V2.0 overlayed                 218  112  2413      35     7       2614
         resident library          288  136  2780     122     7       3315
.BLANK
sr3 V3.0 as distributed            228  144  2241      86   110       2697
                                   221  143  1832      72    88       2242
                                   256  160  1971      51    64       2329
.BLANK
sr4 V3.0 resident library          211  145  1531      99    19       1947
                                   211  146  1351      87    19       1740
                                   239  161  1420      77    19       1736
.BLANK
sri V3.0 I/D space, non-           208  144  1398     101             1787
         overlayed,                209  146  1247     114             1626
         resident library          238  161  1298     104             1652
.BLANK
srj V3.0 as above, larger          200  142  1176      96             1568
         work space                204  143  1042     115             1432
                                   231  159  1086     131             1490
.BLANK.JUSTIFY.FILL
When the test is run on an idle system (Sort was the only task of
significance running at the time), there does not appear to be a great
amount of difference between versions. It can be seen that use of the
resident libraries reduces overlays and reduces the number of disk I/Os
(there is more task image space for work area, so less disk work space is
required), but task elapsed time isn't much improved. In the case of
V2.0, building the task non-overlayed is worse than the overlayed
version: this is because only a few overlays are needed (once the RMS
overlays are taken out), and because less task work area is available, so
more disk work area is used. Of the three V2.0 builds, the overlayed task
linked to the RMS supervisor mode library places the least load on the
system, but this has little effect on an idle system. Comparing V3.0 with
V2.0 as distributed, there is a considerable reduction in the number of
directives and QIOs, but little improvement elsewhere.
Linking to the RMS resident library produces benefits similar to those
obtained with V2.0, and the resulting task now runs with elapsed times
comparable to V2.0 (note that the actual percentage difference between
all of the tasks is small). With V3.0, however, a further improvement is
obtained by building the task non-overlayed, as it may be built as an
I-and-D space task: this yields a net increase in task work space, rather
than the loss that occurred with V2.0, which cannot be built I-and-D
space. (There is so much more code with V3.0, primarily command parsing,
that it cannot reasonably be built non-overlayed without making it
I-and-D space.)# Now the task has both increased internal work space and
no disk overlays, and this can be seen from the even smaller numbers of
QIOs and directives issued. Still, on an idle system, the net elapsed
time is not changed much, due to waiting times for disk I/O. It might
also be noted that V3.0 seems to be a little smarter about choosing the
best number of work files, and in using them effectively.
.PARAGRAPH
The above test was repeated with 10 other programs running, which were
written to load the CPU and disk and so create an environment similar to
what we have on our production systems. The load from these programs is
constant and predictable, however, so that comparisons between the
different versions of Sort remain valid. The data file was reduced to
15,000 records (1875 blocks).
.BLANK.NO JUSTIFY.NO FILL.TEST PAGE 17
                                      Data from SPM
                           Elapsed Time  CPU   DB1  F11ACP Over-
    Version Type                   sec. sec.   QIO     QIO  lays
----------------------------------------------------------------
.BLANK
srt V2.0 as distributed             332  108  2732     129   193
.BLANK
srr V2.0 non-overlayed              345  127  2532     137
.BLANK
srs V2.0 overlayed, reslib          304  119  1868     124
.BLANK
sr3 V3.0 as distributed             254  132  1328      81    88
.BLANK
sri V3.0 I/D space                  242  133   867     115
.BLANK
srj V3.0 as above, larger           238  131   696     127
.BLANK.JUSTIFY.FILL
Now that there are other programs on the system competing for CPU time
and, more importantly, disk I/O, the improvement obtained is more
obvious. Use of the resident library has improved V2.0 elapsed time by
almost 10%; V3.0 is now somewhat faster than V2.0, and a small but
significant improvement can be seen in the non-overlayed versions of V3.0
over the distributed version. Reducing the number of QIOs means less
competition for disk access, and reducing the number of directives issued
means less CPU time is taken up by the operating system, which allows the
program to get more work done within its time slice. Increasing the
internal work space also lets the program waste less time and disk
resource, as it does not have to reference the disk work files as often.
.PARAGRAPH
Note that all of the testing was done with RECORD sorts: I have not
tested other types of sorts, as we do not use them much here. I have
tested the different versions with different data files: with small
(fewer than 3000 records) files, there is less difference between the
different versions. With very small files (fewer than 100 records), V3.0
is usually a little slower: apparently it takes longer to parse out the
commands (there are more of them, especially with specification files)
and initialize the task than V2.0 did, but usually the difference is very
small, and can only be seen by measuring it (it does not appear to a
person sitting at a terminal that the sort took longer).
We have settled on V3.0 linked to the RMS library, but overlayed: we would like to go to the I-and-D space version, but there are other problems (like the inability to install it in VMR) that prevent us from doing so at this time.
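.PARAGRAPH
The relative differences between versions can be checked directly from
the loaded-system table. The following is only a sketch in Python, not
part of the original measurements: the elapsed times are copied from that
table, while the dictionary and function names are invented here for
illustration.
.BLANK
.LITERAL
```python
# Elapsed times in seconds, copied from the loaded-system table.
# The names below are hypothetical; only the numbers come from the report.
loaded_elapsed = {
    "srt": 332,  # V2.0 as distributed
    "srr": 345,  # V2.0 non-overlayed
    "srs": 304,  # V2.0 overlayed, resident library
    "sr3": 254,  # V3.0 as distributed
    "sri": 242,  # V3.0 I/D space
    "srj": 238,  # V3.0 I/D space, larger work space
}


def percent_reduction(baseline, candidate):
    """Percentage by which candidate is lower than baseline."""
    return 100.0 * (baseline - candidate) / baseline


def compare(times, baseline_task):
    """Reduction of every task's time relative to one baseline task."""
    base = times[baseline_task]
    return {
        task: round(percent_reduction(base, seconds), 1)
        for task, seconds in times.items()
    }


if __name__ == "__main__":
    for task, pct in compare(loaded_elapsed, "srt").items():
        print(f"{task}: {pct:5.1f}% less elapsed time than srt")
```
.END LITERAL
.PARAGRAPH
With srt as the baseline, this gives a reduction of roughly 8% for srs
and 23% for sr3; a negative figure (as for srr) means that version took
longer than the baseline. Changing the baseline task name compares any
other pair of versions in the same way.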