RMS-11 Workshop Session
______ ________ _______

Spring 1981 DECUS Symposium
Miami, Florida

Speaker: Tim Day - RMS Development Group

Prepared Questions Answered:
________ _________ _________

Q: An initial sequential GET by KRF field was done on an indexed file. The person then did a GET by RFA, and RMS ignored the KRF field; the retrieval returned the next record on the primary key. Why?

R: This is a user error: it is well documented that a GET by RFA has no meaning except in the context of the primary key. Once you have done a GET by RFA, you have set your context to the primary key. Any values placed in the KRF field are meaningless and are ignored.

Q: Why not place the index of a file on a separate device? This could be done by keeping the name of the index file in the prologue of the data file and opening it on a separate channel.

R: This will probably not come to life because of the large number of problems that can arise in maintaining, tracking, backing up and recovering that form of file. If someone wanted to be really tricky and modify the RMS code on their own, it MIGHT be doable. Doing this with a "shared file" would cause problems!! The reason for doing this is really a performance increase; DEC's overall commitment to performance is by other means, such as buckets, areas and placement.

Q: What is the best way to load an indexed file, last or first record first?

R: If a file is loaded in descending order, you will end up with a file that is longer than necessary, with records not packed as well as they could be. The smallest, densest file results from loading the file in ascending order.

Q: Someone wanted a utility to compress a file on-line and in-place.

R: No commitment to this from DEC, although a show of hands indicated interest in it.

Q: Another question was raised about "hashed indexes" for a file and their possible implementation in RMS.
R: This is just another way to retrieve data from a file, and a much faster one than the indices RMS uses. Proper use of hashed indices should get ANY record at random in two (2) I/O's. With the current use of indices, it COULD take many more I/O's to a device.

Questions From Attendees:
_________ ____ __________

Q: What about placing the indices into a temporary file when an OPEN is executed, possibly on a separate disk, to increase performance?

R: This might work for a file in a single-user-per-file environment. In a file opened for sharing, the indices would be extremely hard to maintain without corruption. If RMS were to become part of an operating system and have knowledge of everything going on, this might be feasible. Right now, though, there are too many drawbacks to this idea.

Q: What is the true story about reclaiming space left by deleted records in an indexed file?

R: There are many different types of deletes. The worst case is a shared file with duplicate keys. The problem arises because RMS has to leave enough of the record to describe the primary key, which oddly enough is the only way RMS can tell that a record has been deleted. This is needed because another person may have your current record as his next record. Generally speaking, the remains (or fossils) of deleted records are compressed in the buckets to leave a contiguous space, which is then available to hold a record IF the record length PLUS any RMS overhead is smaller than the space available. The remains of deleted records stay in the file for the life of the file or until it is reloaded, using IFL for example.

Q: What methods are available for optimizing the structure of a file such as that found on the VAX?

R: DEC is exploring the possibility of developing a utility to aid in defining a file with performance in mind.

POINTS OF INTEREST:
______ __ _________

Concerning ODL Files and Resident Libraries.
__________ ___ _____ ___ ________ __________

An ODL file is a way to do disk overlaying. Under the RMS implementation, if you use a non-overlaid version, you are going to get between 7 and 44KB of RMS code/buffers added to your task space. In an overlaid environment, RMS will get down to about 10KB of your task space in some situations. This may not apply when using RMS from a high-level language, such as Fortran IV+. The above numbers were obtained using MACRO-11 assembly language.

The disadvantages of disk overlays are many. First, program execution speed can be affected, depending on the sequence of operations and the overlay structure itself, because overlaying implies I/O. Second, optimizing an ODL file is plain hard work. Third, the task image on disk will grow depending on the tree structure defined. This may cause problems for systems cramped for disk room. Fourth, the time needed to task build an overlaid program increases depending on the tree structure.

The cure for many of these problems is the use of resident libraries. In terms of RMS, this means that you have a single, shared copy of the RMS code placed in physical memory, to which all tasks are linked at task build time. The first advantage is that there are no overlay I/O operations, as the whole library is resident together. Second, you don't need complex overlay files for RMS code. The ODL file that you would use is 3 lines long, which accounts for approx. 1.5KB of RMS code in your task image. Third, the task images on disk become smaller, and the program should execute faster because of the lack of overlay I/O operations.

The disadvantage of using libraries is that you give up 2 APR's (or 8KW of your task space), plus the little extra RMS code in your 24KW (remember you "lost" 2 APR's) task space. The breakeven point at which to change from having separate copies of RMS code in everyone's task space to installing a resident RMS library which will consume approx.
46KB of physical memory is four (4) simultaneous memory-resident tasks using RMS. This would decrease your OVERALL memory requirements, with a possible increase in execution speed.

Goals of RMS.
_____ __ ____

The greatest goal of RMS is reliable tracking of a user's data. Considerable code was put into RMS to ensure that data will not be lost or corrupted. This was done at the expense of execution time. Second is a content-addressable capability, which generated the indexed file organization. Third is the ability to have multiple indices. Fourth is good sequential access performance on the primary key; the overall structure of RMS was designed toward this goal. Fifth is fair to good access on alternate keys. This can be done by using bucket size, areas and placement. Sixth is Record File Address (RFA) access, with the RFA guaranteed to be the same for the life of the file. Seventh is good space utilization within the file structure. An area for improvement is reclamation of space left by deleted records.

Areas.
______

Even if you specify no areas, you are given an area zero (0), but you as a user don't know it. It won't even show when you "DSP /FU". The best conditions are when your file, and hence your areas, are contiguous. For the most part, the use of areas is a trial-and-error search for the best performance. Proper use of areas will boost sequential access with little increase in random retrieval. The use of areas does not impact the user program; this frees you to try many schemes.

DEF Utility.
___ ________

Is there any way to back up to a previous entry in the case of errors? There is a better (?) problem area: DEF does not attach the terminal being used for input. On a heavily loaded system, if DEF is checkpointed, any user input typed while it is checkpointed will be sent to MCR, NOT to DEF. Naturally, MCR does NOT know what a bucket is; let your imagination take over from here. There is a patch in the works for this.
In answer to the question about re-entering previous parameters, a future release of DEF will address this issue. CTRL-Z is the universal input to terminate an RMS utility.

Odds and Ends.
____ ___ _____

If a user has a contiguous file with a bucket size that spans physical disk blocks, RMS will issue a multi-block read/write to the disk ACP. This increases the performance of your program.

DEC is looking at a means to zero out a file in preparation for re-populating it. This would allow a user to maintain the physical location of a file on a volume. This would be more controlled than deleting and re-creating the file on a multi-user system.
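
As a rough check on the resident-library breakeven figure quoted under ODL Files and Resident Libraries, the arithmetic can be sketched as below. This is only an illustration, not anything presented at the session. The 46KB library size and the 1.5KB per-task remainder are the approximate figures from the notes; the 13KB per-task RMS size is a hypothetical value, chosen only because it falls inside the quoted 7 to 44KB range and happens to reproduce the four-task breakeven. The function names are likewise made up for this sketch.

```python
import math
from typing import Optional

# Approximate figures quoted in the session notes.
RESIDENT_LIBRARY_KB = 46.0  # physical memory used by the shared RMS library
PER_TASK_STUB_KB = 1.5      # RMS code still left in each task image


def private_copies_kb(tasks: int, rms_per_task_kb: float) -> float:
    """Total memory when every task carries its own copy of the RMS code."""
    return tasks * rms_per_task_kb


def resident_library_kb(tasks: int) -> float:
    """Total memory with one shared library plus a small remainder per task."""
    return RESIDENT_LIBRARY_KB + tasks * PER_TASK_STUB_KB


def breakeven_tasks(rms_per_task_kb: float) -> Optional[int]:
    """Smallest task count at which the shared library uses no more memory.

    Returns None when a private copy is no larger than the per-task
    remainder, in which case the shared library never pays for itself.
    """
    saving_per_task = rms_per_task_kb - PER_TASK_STUB_KB
    if saving_per_task <= 0:
        return None
    return math.ceil(RESIDENT_LIBRARY_KB / saving_per_task)


if __name__ == "__main__":
    # 13KB per task is an assumed value, not a figure from the session.
    print(breakeven_tasks(13.0))  # 4
    # The extremes of the quoted 7 to 44KB range move the breakeven a lot.
    print(breakeven_tasks(7.0), breakeven_tasks(44.0))  # 9 2
```

The breakeven is quite sensitive to how much RMS code each task would otherwise carry, so the four-task figure is best read as a rule of thumb for a typical task rather than a fixed threshold.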