WTR - WORD11 to RUNOFF Converter
================================

Users' Guide
============

WTR is a small program to convert a WORD11 text file into its
equivalent RUNOFF file. The WORD11 file should first be copied onto an
RSX-11M disk in image mode (possibly using RTR). When WTR is run it
will prompt for a filename with:-

        ENTER FILENAME (NO .EXT) >

to which you should reply with the name of your WORD11 file (omitting
the .WPS extension). The program will then create an output RUNOFF
file with the same filename and an extension of .RNO. It also creates
an output file with the same filename and an extension of .ERR
containing details of all the errors/problems detected during the
conversion.

When it has finished WTR types a message on the user terminal saying:-

        WTR -- n ERRORS m SERIOUS.

where an error represents anything that WTR could not handle, and a
serious error is one where WTR had to omit some important information
from the output file.

WTR can only handle a fairly straightforward WORD11 file (for instance
it cannot handle decimal-aligned tabs) but will output a message
detailing anything it cannot cope with. The messages are:-

1.  UNKNOWN CONTROL CHARACTER xx AT BLOCK n. OFFSET m

    A control character (i.e. one in the range octal 200-377) was
    detected but could not be recognised at block n (decimal) offset m
    (octal). This is a serious error because it probably means a WORD11
    feature that the program has not been written to cater for.
    Determine from the WORD11 context what the character means and
    update the program.

2.  TEXT STRING TOO LONG:- xxxxxxxxxxx

    A text string of more than 100 characters that could not be divided
    into words (i.e. there were no spaces) has been encountered. WTR
    will only handle lines up to 100 characters long (at the moment).

3.  CANNOT HANDLE ^F STRING:- xxxxxxxxxx

    The ^F control character is used for printer control by WORD11
    (e.g. top of page messages), and the only type that WTR can handle
    currently is ^FTOP, i.e.
    output a message at the top of the page. This is not a serious
    error as most other types are merely cosmetic.

4.  TOO MANY TITLE LINES, LINE IGNORED:- xxxxxxxxxx

    RUNOFF only allows two title lines (in fact one Title and one
    Subtitle), whereas WORD11 will allow as many as you like. WTR
    ignores all titles after the first two lines. This is not a
    serious error.

5.  TOO MANY ^ZS, ONE OMITTED IN LINE:- xxxxxxxxxx

    ^Z is the control character used to represent a TAB, and the TAB
    positions are defined in a WORD11 ruler sequence. WTR has room
    only to store definitions for 15 TAB positions in a line, and will
    output this message if you try to use more. It is not a serious
    error.

6.  ODD NUMBER OF CHARS IN TAB SEQUENCE IN RULER

    A ruler sequence had an odd number of characters defining the TAB
    sequence (there is usually an even number - two per tab). The last
    (odd) character is ignored.

7.  CANNOT HANDLE TAB TYPE xxx

    WTR currently can only handle standard (i.e. left-aligned) TABs.
    If it encounters a different type in a ruler sequence it will
    output this message and assume a left-aligned TAB instead.


WTR - WORD11 to RUNOFF Converter
================================

System Guide
============

WTR is a small FORTRAN program that simply reads through the input
file, checking each character and either interpreting it (if it is a
control character) or copying it unchanged into the output file.
Control characters are distinguished by having the 8th bit set - i.e.
when tested as a byte they appear negative. In addition the first four
bytes of each block contain check information that can be ignored.

If DMP is used to dump the WORD11 file in ASCII, the control characters
are represented as their ASCII equivalents without the 8th bit set, and
it is convenient to refer to them by that representation, rather than
by their octal values. The control characters recognised so far are:-

1.  ^@ (200) - Start of File

    This character indicates the start of the file proper.
    All information before it (and it itself) can be ignored.

2.  ^B (202) - Start of Ruler Sequence

    This is obviously the most important part of WORD11 for decoding
    purposes. A Ruler Sequence contains a number of indicators in fixed
    positions, followed by a variable length string of Tab information,
    followed by a terminator. The sequence is as follows:-

        Char 1   - ^B (202) (Start of Sequence)
        Char 2   - Unknown, but seemingly unimportant
        Char 3   - Unknown, but seemingly unimportant
        Char 4   - Unknown, but seemingly unimportant
        Char 5   - Offset of Left Margin
        Char 6   - Paper Size (Lines per Page)
        Char 7   - Offset of Wrap Margin
        Char 8   - Offset of Right Margin
        Char n   - %X (370) (Start of Tab Sequence)
        Char n+1 - Offset of First Tab
        Char n+2 - Type of First Tab (0 = Left-Aligned)
        Char n+3 - Offset of Second Tab
           :          :
        Char m   - %^ (376) End of Ruler Sequence

    The Start of Tab Sequence (370) seems currently always to be in
    Character Position 9, but presumably it is flagged so that it can
    be changed at a later date. All the values are stored internally,
    and in addition the routine outputs a .RM and a .PS line with the
    values given.

3.  %_ (377) - Qualifier

    This is of no importance and can be ignored.

4.  ^X (230) - Underline Next Character

    This is simply a flag to say that the next character should be
    underlined. WTR simply inserts the RUNOFF underline character (&)
    into the output buffer.

5.  ^F (206) - Title/Footnote Text

    This flags the start of Title or Footnote text. WTR simply outputs
    the current record, and flags that it is in a TITLE sequence.

6.  ^R (222) - New Line flag

    This means that the text contains a new line (carriage return) at
    this point. A 'special' use of this is in the Title/Footnote
    sequence, where the descriptor (TOP/BOTTOM) and each line of the
    Title/Footnote is preceded by a ^R. Thus the routine checks to see
    if it has just had a ^F and if so checks to see what sort it is
    (TOP/BOTTOM).
    If a TOP it flags it as such, otherwise it outputs a message and
    ignores the whole sequence. If it was a TOP then the next line is
    output as a .T line, and the one after that as a .ST line (and all
    further ones are rejected). If it is not doing a Title/Footnote
    sequence, the routine merely outputs the current record with
    appropriate margin control.

7.  ^H (210) - End of ^F sequence

    This signals the end of a Title/Footnote sequence. The routine
    simply switches off the 'Title Sequence' flag and outputs the last
    title line (unless it has already done as many as it can).

8.  ^T (224) - Centre Preceding Text

    This comes after the text that is to be centred. The routine simply
    outputs the text preceded by a .C (cannot be done if a title).

9.  ^D (204) - New Page

    This represents a request for a new page, and the routine simply
    outputs a .PG line.

10. ^Z (232) - Insert a TAB

    This represents a TAB in the text. The routine searches the TAB
    definitions from the last ruler definition and advances the current
    record pointer to the offset of the next tab in sequence. If the
    pointer is already past the last defined tab this character is
    ignored.

11. $ (244) - End of File

    This represents the end of the text, and all following characters
    can be ignored.

12. . (256) - ?

    I cannot determine what this character means, but ignoring it seems
    to do no harm.

Ordinary characters are simply copied, usually unchanged, into the
output buffer. If the character is one of the RUNOFF special characters
(&^\_#) it is preceded by an _ to indicate it is to be taken literally.
(Note that because of this WTR cannot handle a line of underscores [_]
for each one has to be represented as __ which will produce a total
line length of 160 or so - more than WTR can handle.) In addition, if
several spaces occur together they are replaced by # to stop RUNOFF
compressing them. A check is kept on the line length, and the routine
attempts to break the line at a 'reasonable' point, but will only do so
at a space.
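
The handling of ordinary characters described above can be sketched as
follows. This is only an illustration in Python, not the FORTRAN that
WTR actually uses. The function name escape_for_runoff is hypothetical,
and the sketch assumes that each space in a run of two or more becomes
one #, which the description above does not state explicitly.

```python
# The five RUNOFF special characters listed in the text: & ^ \ _ #
RUNOFF_SPECIALS = "&^\\_#"


def escape_for_runoff(text):
    """Return text with RUNOFF specials quoted and space runs protected.

    Hypothetical helper, not part of WTR. Assumption: every space in a
    run of two or more spaces is replaced by one '#'.
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == " ":
            # Measure the length of this run of spaces.
            j = i
            while j < len(text) and text[j] == " ":
                j += 1
            run = j - i
            # A single space is left alone; longer runs become '#'s.
            out.append(" " if run == 1 else "#" * run)
            i = j
        elif ch in RUNOFF_SPECIALS:
            # Precede a special character with '_' so it is taken literally.
            out.append("_" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

Applied to a line of 80 underscores this returns 160 characters, which
is the case noted above as being more than WTR can handle.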

This program is not as sophisticated as it might be, as shortly after
writing the first version I decided I didn't really like RUNOFF. These
days I create a print document from WORD11 (use the DT option in the
PRINT Menu to direct the output to a file), copy it via PIP/FLX and
then use TECO macros to do my margin adjustments in any future changes.

WTR is a non-privileged FORTRAN IV-PLUS program, with a taskname of WTR
and a MAXBUF of 512.
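
As an illustration of the Ruler Sequence layout given in the System
Guide, the decoding might be sketched like this. It is a Python sketch
and not the WTR source. The function name parse_ruler and the layout of
the returned dictionary are hypothetical. It assumes that margins and
tab offsets are stored as plain byte values, and that tabs beyond the
15 WTR has room for are dropped silently.

```python
RULER_START = 0o202    # ^B, start of Ruler Sequence
TAB_SEQ_START = 0o370  # %X, start of Tab Sequence
RULER_END = 0o376      # %^, end of Ruler Sequence
MAX_TABS = 15          # WTR stores at most 15 TAB positions


def parse_ruler(data):
    """Decode one ruler sequence from bytes that begin at the ^B.

    Hypothetical helper. Assumes offsets are plain byte values.
    """
    if len(data) < 8 or data[0] != RULER_START:
        raise ValueError("not a ruler sequence")
    # Locate the terminator; bytes.index raises ValueError if absent.
    end = data.index(RULER_END, 8)
    ruler = {
        "left_margin": data[4],     # Char 5
        "lines_per_page": data[5],  # Char 6
        "wrap_margin": data[6],     # Char 7
        "right_margin": data[7],    # Char 8
        "tabs": [],
        "messages": [],
    }
    # The tab sequence is flagged rather than assumed to be at Char 9.
    start = data.find(TAB_SEQ_START, 8, end)
    if start == -1:
        return ruler
    tab_bytes = data[start + 1:end]
    if len(tab_bytes) % 2:
        # Two characters per tab are expected; ignore the odd one.
        ruler["messages"].append(
            "ODD NUMBER OF CHARS IN TAB SEQUENCE IN RULER")
        tab_bytes = tab_bytes[:-1]
    for k in range(0, len(tab_bytes), 2):
        offset, kind = tab_bytes[k], tab_bytes[k + 1]
        if kind != 0:
            # Only left-aligned tabs (type 0) are handled; any other
            # type is reported and then treated as left-aligned.
            ruler["messages"].append("CANNOT HANDLE TAB TYPE %d" % kind)
        if len(ruler["tabs"]) < MAX_TABS:
            ruler["tabs"].append(offset)
    return ruler
```

The numeric format used for the tab type in the message is a guess, as
the guide shows it only as xxx.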