	.TITLE	GRELOC - LOCATE SEARCH STRING IN BUFFER
	.IDENT	-010000-
	.ENABL	LC
;+
; Abstract: GRELOC
;
; This module attempts to locate the search string in
; the user buffer. If found, it returns with cc-c
; clear, and if it is not found, it returns with cc-c
; set. It supports a generalised search string,
; including the following constructs:
;
;    ? - matches any single character. Must have
;        exactly one character in the string matched.
;    * - matches any set of zero or more characters.
;    < - beginning of subargument
;    > - end of subargument
;    ^ - matches only the beginning of the input line.
;        No characters are matched by this operator,
;        only the beginning of the line.
;    $ - matches only the end of the input line. No
;        characters are matched by this operator,
;        only the end of the line.
;    \ - escape next character from normal lexical
;        meaning, and treat as a literal.
;    any other character - is treated as a literal.
;
; The subargument construct will always match a single
; character, and the argument contained in the
; subargument is a list of the characters to be
; matched. A range of characters to be matched by the
; subargument can be specified by a "-" between the
; two characters delimiting the range. The $, ^, and \
; characters are valid within a subargument, but the
; * and ? characters are not recognised with their
; normal meanings.
;
; Because of this, the following characters will (in
; general) need to be escaped with the \ operator:
;
;    \? \* \< \> \^ \$ \\ \-
;
; Examples of strings:
;
;    search string    matches
;
;    <0-9>            A1 A2 A3 F5 Z9 X0
;    *                AXY XBT HHC RRA QQQQC
;    *<0-9>           ABC123 A1 ZYZ123 #%D6
;
; Arguments:
;
;    R1 -> data buffer to be scanned
;    R2 =  length of data buffer
;    R3 -> search string to be matched
;    R4 =  length of search string to be matched.
;
; Outputs:
;
;    R0-R5 are corrupted.
;
;    cc-c is set if no match occurred.
;    cc-c is cleared if a match occurred.
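;
; Calling sequence (a minimal sketch, not part of the original
; module; BUF, BUFLEN, PAT, PATLEN and NOMAT are hypothetical
; caller-side symbols, and CALL is assumed to be the same
; subroutine-call operation used in the code of this module):
;
;	MOV	#BUF,R1		; R1 -> data buffer to be scanned
;	MOV	#BUFLEN,R2	; R2 = length of data buffer
;	MOV	#PAT,R3		; R3 -> search string to be matched
;	MOV	#PATLEN,R4	; R4 = length of search string
;	CALL	GRELOC		; R0-R5 are corrupted on return
;	BCS	NOMAT		; cc-c set -- no match occurred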
;-
	.MACRO	FAIL
	SEC
	RETURN
	.ENDM
	.MACRO	SUCCES
	CLC
	RETURN
	.ENDM
;
	.PSECT	PDATA,RO,D
;
CHTABL:	.REPT	44		; First 44 (octal) characters are all non-lexical
	.WORD	CHAR		;
	.ENDR
	.WORD	EOS		; $ - match end of string
	.WORD	CHAR		; % - non-lexical
	.WORD	CHAR		; & - non-lexical
	.WORD	CHAR		; ' - non-lexical
	.WORD	CHAR		; ( - non-lexical
	.WORD	CHAR		; ) - non-lexical
	.WORD	STR		; * - match a string
	.WORD	CHAR		; + - non-lexical
	.WORD	CHAR		; , - non-lexical
	.WORD	CHAR		; - - non-lexical
	.WORD	CHAR		; . - non-lexical
	.WORD	CHAR		; / - non-lexical
	.REPT	10.		; numerics - non-lexical
	.WORD	CHAR
	.ENDR
	.WORD	CHAR		; : - non-lexical
	.WORD	CHAR		; ; - non-lexical
	.WORD	SUBARG		; < - begin subargument
	.WORD	CHAR		; = - non-lexical
	.WORD	CHAR		; > - non-lexical
	.WORD	ANY		; ? - match any single character
	.REPT	27.		; @ and alphas - non-lexical
	.WORD	CHAR		;
	.ENDR
	.WORD	CHAR		; [ - non-lexical
	.WORD	ESCAPE		; \ - escape next character
	.WORD	CHAR		; ] - non-lexical
	.WORD	BEG		; ^ - beginning of line
	.WORD	CHAR		; _ - non-lexical
	.REPT	32.		; 140 - 177 (incl. lower case alphas) - non-lexical
	.WORD	CHAR		;
	.ENDR
	.REPT	128.		; 200 - 377 - non-lexical
	.WORD	CHAR		;
	.ENDR
;
	.PSECT	IDATA,RW,D
BEGLIN:	.WORD	0		; Beginning of input line.
	.PSECT	GRELOC,RO,I
GRELOC::
	MOV	R1,BEGLIN	; Save beginning of input line.
GRELO2:
LOOP:	DEC	R4		; Is the search string done?
	BGE	10$		; No -- process the next character.
	SUCCES			; Yes -- return success.
10$:	CLR	R5		; Clear a register
	BISB	(R3)+,R5	; Get a character without sign extend
	ASL	R5		; Multiply by 2 to form a word offset.
	JMP	@CHTABL(R5)	; And dispatch to the character.
;
; Subprocedure to match a single literal character. It must
; match exactly one character, so it is not legal to match nothing.
;
CHAR:	DEC	R2		; Anything left in string?
	BLT	10$		; No -- fail
	CMPB	(R1)+,-1(R3)	; Does it match?
	BEQ	LOOP		; Yes -- next character.
	BIT	#SW.UPC,SWORD	; Match upper case?
	BEQ	10$		; J if not.
	CMPB	-1(R1),#140	; Is it lower case?
	BLO	10$		; J if not.
	MOVB	-1(R1),R5	; Into register.
	BIC	#177440,R5	; Fold to upper case (clear bit 5 and high byte).
	CMPB	R5,-1(R3)	; Match now?
	BEQ	LOOP		; J if so.
10$:	FAIL			; Failure exit
;
; Subprocedure to escape the next character. This looks very much
; like the "char" procedure, except that it will consume two
; bytes of the search string rather than just one. Both consume
; just one byte of the data buffer.
;
ESCAPE:	DEC	R2		; Anything left in string?
	BLT	10$		; No -- fail.
	DEC	R4		; Anything left in search string?
	BLT	10$		; No -- fail.
	CMPB	(R1)+,(R3)+	; Compare the next character.
	BEQ	LOOP		; And loop on success.
10$:	FAIL			; Bad comparison -- failure.
;
; Subprocedure to match any single character. It must match
; exactly one character, so it is not legal to match nothing.
;
ANY:	DEC	R2		; Must have a char left in string.
	BGE	10$		; Ok -- continue
	FAIL			; No -- error
10$:	INC	R1		; Skip this character.
	BR	LOOP		; And continue.
;
; Subprocedure to recognise end-of-string. This must never
; match any characters.
;
EOS:	TST	R2		; Must not have a char left.
	BEQ	LOOP		; Success if so.
	FAIL			; Failure - not end of line.
;
; Subprocedure to recognise beginning-of-string. This must
; never match any characters.
;
BEG:	CMP	R1,BEGLIN	; At beginning of line?
	BEQ	LOOP		; Yes -- success.
	FAIL			; Failure - not beginning of line.
;
; Subprocedure to recognise subarguments. These are rather
; expensive to recognise, but their functionality is probably
; worth it. In any event, a subargument only matches a single
; character.
;
SUBARG:	CLR	R0		; No ranges in effect.
10$:	DEC	R4		; Consume one byte of argument
	BLT	90$		; None left -- fail.
	CLR	R5		; Clear out the register
	BISB	(R3)+,R5	; Move char into reg without sign extend
	CMPB	R5,#'\		; Escape character?
	BEQ	50$		; Yes -- ignore lexical info.
	CMPB	R5,#'>		; End of argument?
	BEQ	90$		; Yes -- failure (no match found).
	CMPB	R5,#'-		; A - (range specification)?
	BEQ	80$		; Yes, process it.
	CMPB	R5,#'$		; A $ (end of string)?
	BEQ	70$		; Yes, process it.
	CMPB	R5,#'^		; A ^ (beginning of line)?
	BEQ	60$		; Yes, process it.
15$:	TST	R2		; Do we have anything in string?
	BLE	10$		; No -- just go on to next argument.
	CMPB	R5,(R1)		; Match?
	BEQ	95$		; Yes -- success.
	TST	R0		; Is this the end of a range?
	BGE	40$		; No
	BIC	#100000,R0	; Show not to be a range any more.
	CMPB	(R1),R0		; Is it >= old character?
	BLT	40$		; No
	CMPB	(R1),R5		; Is it <= current character?
	BLE	95$		; Yes -- success.
40$:	MOV	R5,R0		; Set the new character to look for.
	BR	10$		; And loop.
;
; Process \ (escape) character
;
50$:	DEC	R4		; Decrement count in search string.
	BLT	90$		; Leave if no more in string.
	CLR	R5		; Clear out the register again.
	BISB	(R3)+,R5	; Get the character without sign extend
	BR	15$		; And re-join main line code.
;
; Process ^ (beginning of line) character
;
60$:	CMP	R1,BEGLIN	; Are we at the beginning of the line?
	BNE	10$		; No -- look at next character in arg
	BR	71$		; And take success exit.
;
; Process $ (end of line) character
;
70$:	TST	R2		; Anything in string?
	BNE	10$		; Yes -- then a $ doesn't match.
71$:	MOV	R0,-(SP)	; Save our current context.
	MOV	R1,-(SP)	; ...
	MOV	R2,-(SP)	; ...
	MOV	R3,-(SP)	; ...
	MOV	R4,-(SP)	; ...
	CALL	GETEND		; Get end of current subargument.
	CALL	GRELO2		; Call ourselves recursively.
	BCC	77$		; Skip on success.
	MOV	(SP)+,R4	; Recover registers.
	MOV	(SP)+,R3	; ...
	MOV	(SP)+,R2	; ...
	MOV	(SP)+,R1	; ...
	MOV	(SP)+,R0	; ...
	BR	10$		; And try the next character.
77$:	ADD	#12,SP		; Pop off saved context.
	SUCCES			; And return success.
;
; Process - (range) character
;
80$:	BIS	#100000,R0	; Show that this is to be a range.
	BR	10$		; And loop.
;
; Handle various exit conditions
;
90$:	FAIL			; Failure
95$:	MOV	R0,-(SP)	; Save our current context.
	MOV	R1,-(SP)	; ...
	MOV	R2,-(SP)	; ...
	MOV	R3,-(SP)	; ...
	MOV	R4,-(SP)	; ...
	CALL	GETEND		; Get to the end of the argument.
	INC	R1		; Skip past character we're looking at.
	DEC	R2		; Decrement char count.
	CALL	GRELO2		; Try to match remaining string.
	BCC	99$		; Skip on success.
	MOV	(SP)+,R4	; Recover context.
	MOV	(SP)+,R3	; ...
	MOV	(SP)+,R2	; ...
	MOV	(SP)+,R1	; ...
	MOV	(SP)+,R0	; ...
	BR	10$		; And try next character.
99$:	ADD	#12,SP		; Pop off the saved context.
	SUCCES			; And return success.
;
; Internal subroutine of SUBARG to retrieve the end of the current
; subargument list.
;
GETEND:	DEC	R4		; Consume a byte of the argument.
	BLT	90$		; Leave if none left.
	CMPB	(R3),#'\	; An escape character?
	BNE	70$		; No -- process normally.
	DEC	R4		; Yes -- ignore any special characters
	BLT	90$		; None left, just leave.
	ADD	#2,R3		; Increment past \ and thing after it
	BR	GETEND		; And continue scanning for >
70$:	CMPB	(R3)+,#'>	; Is it a >?
	BNE	GETEND		; No, just ignore it.
90$:	RETURN			; And return to the caller.
;
; Subprocedure to match a new substring. It need not match
; more than zero characters, so it is legal to match nothing.
; This operation returns success if it had nothing left in the
; search string when it exhausted the user buffer, or if any of
; the substrings tried matched. It always returns success if the
; remaining length of the search string is zero.
;
STR:	TST	R4		; Is the search string done?
	BEQ	50$		; Yes -- success immediately.
20$:	MOV	R1,-(SP)	; Save this position
	MOV	R2,-(SP)	; And length
	MOV	R3,-(SP)	; Save current context
	MOV	R4,-(SP)	; In the search string
	CALL	GRELO2		; Try to match this portion.
	BCC	40$		; If found a match, continue.
	MOV	(SP)+,R4	; Recover search string context
	MOV	(SP)+,R3	; ...
	MOV	(SP)+,R2	; Recover status
	MOV	(SP)+,R1	; ...
	INC	R1		; Increment input pointer.
	DEC	R2		; Consume a character from the input
	BGE	20$		; Loop if still more in input.
	FAIL			; Otherwise, fail.
40$:	ADD	#10,SP		; Pop off status information.
50$:	SUCCES			; And succeed.
	.END