. Insert the following two lines when including in UNIX Manual
.	.ds CH "CTRACER
.	.ds CF "- 5/% -
. Insert the following line for printing on A4 paper
.	.pl
. Insert the following two lines when typing on diablo
.	.nr LL 80
.	.nr LT 80
.de SS
.RS
.RS
..
.de SE
.RE
.RE
..
.TL
C-TRACER: a symbolic debugging facility
.AU
(preliminary description)
.AU
Paul Klint
.AI
Mathematisch Centrum
Amsterdam
.SH
Introduction
.PP
C-TRACER was developed for debugging programs written in the
language C.
Unlike the existing debugging tools (DB and CDB)
C-TRACER does not use post-mortem dumps, but is active during the
execution phase of a user program.
It is only required to compile the program under a special
option:
.DS L
	cct prog.c
.DE
The C-compiler now generates "breakpoints" in the object code,
i.e. points where, during the execution of the traced program, control
.ul
may
be passed to the user to inspect and modify values of variables and the like.
The compiler generates breakpoints at the following places:
.SS
.IP -
before all tests (in while, for, if, do and switch statements).
If we indicate a breakpoint by "#i#" where i is an integer,
then breakpoints are generated at the following places:
.RS
.DS
if(#1#a>b) ...
while(#2#a>b) ...
for(n=0;#3#n<10;n++) ...
switch(#4#n) ...
do ... while(#5#n>0)
.DE
.RE
.IP -
at the beginning of a compound statement immediately after '{';
in this way extra breakpoints may be introduced by adding braces
(in the source-text!),
but in practice one never needs this escape mechanism.
.IP -
after a label.
.IP -
at routine entry.
.IP -
before return from a routine.
.SE
.LP
In general not more than one breakpoint is generated per line
in the source-text of the traced program.
.SH
Global mode of operation
.PP
As soon as a breakpoint is reached during the execution of a traced
program, the name of the currently active routine and the line-number
of the line containing the breakpoint are printed on the standard
output file (the terminal in most cases):
.DS
<<<main:6
.DE
Line-numbers are local to routine declarations; the line containing the
opening '{' is given line-number one.
Conversation at a breakpoint starts with "<<<" and is closed by
">>>" to distinguish it clearly from the normal output of the traced program.
C-TRACER is now ready to accept commands like:
.SS
.IP -
Terminating the conversation at the current breakpoint (by means of
an empty line, i.e. a lone newline character)
or program execution (by means of the EOT character).
.IP -
Displaying (or changing) values of variables, array-elements, pointers and so forth.
.IP -
Stating the conditions a following breakpoint must satisfy to give
control to the user.
Conditions consist of
.RS
.IP -
routine name and line range within that routine, or
.IP -
relations between variables in the traced program.
.RE
.sp 1
If no condition is specified control is given at the next breakpoint.
.SE
.LP
In the following sections all C-TRACER commands will be discussed
in more detail.
.SH
Inspection of values
.PP
The value of any entity that is addressable in C can be inspected by presenting
the "symbolic address" (or lvalue in C terminology) of that entity
followed by a newline character as a command to C-TRACER.
Consider the following example, where
the i-th breakpoint is indicated by "#i#":
.DS
main(){
	int i,x,ar[5],*pt;
	char ch;
	float fl;
#1#	x = 3; pt = &ar[1]; ar[1] = 17;
#2#	ch = 'a';
#3#	fl = 3.5;
	for(i=0; #4# i<5; i++)
		ar[i] = i;
}
.DE
Suppose the for-loop is executed once and #4# is reached for the
second time.
The following conversation could now take place (input is underlined and
the newline character is indicated by '^'):
.DS
<<<main:8
.ul
x^
 ... 03 3
.ul
i^
 ... 01 1
.ul
ar[1]^
 ... 021 17
.ul
*pt^
 ... 021 17
.ul
ch^
 ... 0141 'a'
.ul
fl^
 ... 3.500000
^
>>>
.DE
.PP
For objects of type
.ul
int
an octal and a decimal value are given.
For objects of type
.ul
char
an octal value and an ASCII equivalent are given, provided that
the value corresponds with a printable ASCII character.
If this is not the case, the ASCII equivalent is replaced by '?'.
For pointers of some type,
the value of that pointer is given in octal, prefixed by "(adr)".
For objects of type
.ul
struct
an error message is given, because the information about the composition
of a struct is not available;
fields of structs must be specified explicitly by the user,
i.e. to inspect the fields of struct {int n; char c;} x;
one must ask for x.n and x.c explicitly.
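.PP
As an illustration only, the display rules for
.ul
int
and
.ul
char
described above can be sketched in C.
The names format_int and format_char are hypothetical and do not come from C-TRACER itself;
the exact layout is inferred from the conversation shown earlier,
and the quoting of the '?' replacement is an assumption.
.DS L
```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Format an int as in the conversation above: octal value, then decimal. */
int format_int(char *buf, size_t n, int v)
{
    assert(buf != NULL);
    return snprintf(buf, n, " ... %#o %d", (unsigned)v, v);
}

/* Format a char: octal value, then the ASCII equivalent.
   Non-printable values show '?' instead (quoting is an assumption). */
int format_char(char *buf, size_t n, char c)
{
    unsigned char u = (unsigned char)c;

    assert(buf != NULL);
    return snprintf(buf, n, " ... %#o '%c'", (unsigned)u,
                    isprint(u) ? u : '?');
}
```
.DE
.LP
Under these assumptions, formatting the value 17 yields the same text
as the response to "ar[1]" in the example conversation.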
.SH
Inspection of arrays
.PP
A special facility exists in C-TRACER to avoid tedious retyping
when successive array-elements must be inspected.
A command with syntax
.DS
lvalue / [ rvalue_1 ] , [ rvalue_2 ]/
.DE
is used for this purpose.
All the non-alphanumeric symbols
in the line above are delimiters typed in at the terminal.
The effect is that the values of lvalue[rvalue_1], lvalue[rvalue_1+1], ... ,
lvalue[rvalue_2] are displayed.
If array[3], array[4] and array[5] contain 23, 24 and 25 respectively,
then these values can be obtained by the command:
.DS
array/[3],[5]/
.DE
to which C-TRACER responds with:
.DS
[3] ... 027 23
[4] ... 030 24
[5] ... 031 25
.DE
.LP
Rvalue_1 must be less than rvalue_2 and the type of the lvalue must be
an array of some kind.
The same rules as for normal inspection of values apply.
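.PP
The effect of the array command can be pictured with a small C sketch.
It handles arrays of
.ul
int
only, and the names format_element and show_range are hypothetical;
the sketch is not taken from the C-TRACER implementation.
.DS L
```c
#include <assert.h>
#include <stdio.h>

/* Format one element as in the response above: [i] ... octal decimal */
int format_element(char *buf, size_t n, const int *a, int i)
{
    assert(buf != NULL && a != NULL);
    return snprintf(buf, n, "[%d] ... %#o %d", i, (unsigned)a[i], a[i]);
}

/* Show a[lo] .. a[hi], bounds included.
   Returns the number of lines shown, or -1 if lo is not less
   than hi (the rule stated in the text). */
int show_range(const int *a, int lo, int hi)
{
    char line[64];
    int i;

    assert(a != NULL);
    if (lo >= hi)
        return -1;
    for (i = lo; i <= hi; i++) {
        format_element(line, sizeof line, a, i);
        puts(line);
    }
    return hi - lo + 1;
}
```
.DE
.LP
With an array holding 23, 24 and 25 in elements 3 to 5, the call
show_range(array, 3, 5) prints three lines in the format of the response shown above.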
.SH
Modification of values
.PP
During the conversation at a breakpoint it is also possible to
modify values by means of assignments which closely resemble
C assignment statements.
If we stick to the same example, the following dialogue is
possible:
.DS
<<<main:8
.ul
x^
 ... 03 3
.ul
x=8^
 ... 010 8
.ul
x^
 ... 010 8
.ul
pt=&ar[3]^
 ... (adr) 0177732
.ul
pt^
 ... (adr) 0177732
.ul
*pt=29^
 ... 035 29
.ul
*pt^
 ... 035 29
.ul
ar[3]^
 ... 035 29
.ul
ch = 'b'^
 ... 0142 'b'
.ul
ch^
 ... 0142 'b'
^
>>>
.DE
In general an arbitrary rvalue, i.e. an expression resulting in a
value, is allowed on the right-hand side.
Observe that C-TRACER displays the value that is assigned to the
left-hand side of the assignment.
Appendix A gives the precise syntax of lvalues and rvalues.
.SH
Breakpoint conditions
.PP
Two types of breakpoint conditions exist:
.ul
line-conditions
and
.ul
stop-conditions.
The former deal with the specification of a certain area
in one named routine where breakpoints can become active;
the latter can be used to specify an arbitrary relation between
variables in the traced program which must hold,
before some breakpoint can become active.
.SH
Line-conditions
.PP
A command with
the following syntax is used to indicate an area in a routine:
.DS
	either:	/ name /
	or:	/ name / from /
	or:	/ name / from , to /
	or:	/ name / , to /
	or:	/ name / from , to / repeat
	or:	any of the above forms followed by a repeat count
.DE
If we follow the components of the command from left to right,
the overall meaning of this statement is:
.SS
.IP IF
the currently executing routine has the name "name"
.IP AND
the breakpoint under consideration lies in a line with line-number
between "from" and "to" (bounds included)
.IP AND
this is the "repeat"-th time that this event happens
.IP THEN
this breakpoint may become active.
.SE
.LP
Here follows a more extensive explanation of the components of a line-condition:
.SS
.IP name:
.br
A C-function name. Though "name" may contain more than 8
alphanumeric characters or underscores, only the first 8
are significant.
If "name" is absent the previous name is used.
Initially "main" is used.
.IP from:
.br
Rvalue giving start of line-range.
If "from" is absent the previous value is taken.
Initially the value is 1.
Remember that line-numbers are local to routine declarations.
.IP to:
.br
Rvalue giving the end of the line-range.
If the comma is present but "to" is missing the previous
value is used.
Initially the value is 1000.
If both comma and "to" are absent the value of "from" (as defined above)
is used.
This situation occurs frequently if control is required in one specific
line.
.IP repeat:
.br
Rvalue giving repetition count.
If "repeat" is absent the previous value is used.
Initially the value 1 is taken.
The repeat mechanism is powerful but obscure.
Its major application is to get control when a specific line
has been executed "repeat" times.
However, if the line-range contains more than one line, every breakpoint
in this area which is reached during
execution will decrement the repeat counter
and thus the number of breakpoints in the line-range must
be taken into account carefully.
.SE
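.PP
The IF/AND/THEN reading of a line-condition can be sketched in C as follows.
This is a minimal sketch under stated assumptions, not the C-TRACER implementation:
the struct and both function names are hypothetical,
and the state of the condition after it has fired is not specified
in this description (here the counter is simply left exhausted).
.DS L
```c
#include <assert.h>
#include <string.h>

#define SIGNIFICANT 8   /* only the first 8 characters of a name count */

struct line_cond {
    char name[SIGNIFICANT + 1];
    int from, to;       /* line-range, bounds included */
    int repeat;         /* matching breakpoints still to be counted */
};

/* Set a line-condition; a longer name is cut to its significant part. */
void line_cond_set(struct line_cond *lc, const char *name,
                   int from, int to, int repeat)
{
    assert(lc != NULL && name != NULL);
    strncpy(lc->name, name, SIGNIFICANT);
    lc->name[SIGNIFICANT] = 0;
    lc->from = from;
    lc->to = to;
    lc->repeat = repeat;
}

/* Called at every breakpoint.
   Returns 1 if this breakpoint may become active, 0 otherwise. */
int line_cond_hit(struct line_cond *lc, const char *routine, int line)
{
    assert(lc != NULL && routine != NULL);
    if (strncmp(lc->name, routine, SIGNIFICANT) != 0)
        return 0;
    if (line < lc->from || line > lc->to)
        return 0;
    /* every breakpoint reached inside the range counts */
    return --lc->repeat == 0;
}
```
.DE
.LP
Note that the counter is decremented by every breakpoint inside the range,
which is why a range spanning several lines makes the repeat count hard to predict.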
.LP
When a line-condition is given to C-TRACER it responds with its
interpretation of that command.
Some examples:
.DS
.ul
/fun/^
fun from 1 to 1000
.ul
//7,10/^
fun from 7 to 10
.ul
//,50/^
fun from 7 to 50
.ul
//40/^
fun from 40 to 40
.ul
///5^
fun from 40 to 40 repeat 5
.ul
//^
fun from 40 to 40 repeat 5
.DE
The user is responsible for the occurrence of at least one
breakpoint in the specified line-range.
This is not checked by C-TRACER.
A non-satisfiable line-condition results if the line-range
does not contain any breakpoints.
The specification of a line-condition does not terminate the
conversation at the current breakpoint.
A newline character is still required to end the dialogue and
activate the line-condition.
.SH
Stop-conditions
.PP
The most powerful feature of C-TRACER is provided by 
the so-called stop-conditions.
In general a list of relations is given to C-TRACER which must hold
before any breakpoint may become active.
Some examples of legal stop-conditions are
.DS
(x>=0)
(x>=0 && c== 'a')
(x>=0 &&(c == 'a' || c=='b'))
.DE
The syntax of stop-conditions is:
.DS
stop_condition
	: ( relation_list )
	.
.DE
.DS
relation_list
	: ( relation_list )
	| relation_list boolean_op relation
	| relation
	.
.DE
.DS
relation
	: lvalue relational_op rvalue
	.
.DE
.DS
boolean_op
	: ||
	| &&
	.
.DE
.DS
relational_op
	: >
	| <
	| >=
	| <=
	| ==
	| !=
	.
.DE
.PP
An explanation of the precise meaning of
stop-conditions will now be given.
A stop-condition is activated at the conclusion of the conversation
in which it was specified.
From that moment on, C-TRACER tests at every breakpoint encountered
during program execution whether individual relations hold.
The first breakpoint where the whole stop-condition is satisfied
is displayed and control is given to the user.
.PP
Internally C-TRACER turns lvalues into addresses and rvalues into
constants.
One of the consequences of this approach is that lvalues which themselves
contain variables are "frozen".
If "n" has value 3 at the moment the following stop-condition is
specified, then
.DS
(array[n] == 5)
.DE
will test whether array[3] equals 5,
even if the value of "n" is changed during program execution
after the specification of this stop-condition.
If "m" equals 4 then
.DS
(array[n] == array[m])
.DE
compares array[3] with array[4].
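.PP
The "freezing" of lvalues can be made concrete with a small C sketch.
It is an illustration under assumptions, not the internal representation used by C-TRACER:
the type and function names are hypothetical, only
.ul
int
operands are handled,
and only the case of an lvalue compared with a constant is shown.
.DS L
```c
#include <assert.h>
#include <stddef.h>

enum rel_op { OP_GT, OP_LT, OP_GE, OP_LE, OP_EQ, OP_NE };

/* A frozen relation: the lvalue is an address, the rvalue a constant. */
struct relation {
    const int *addr;
    enum rel_op op;
    int constant;
};

/* Freeze a relation. The caller evaluates the lvalue once,
   at the moment the stop-condition is specified. */
struct relation relation_freeze(const int *addr, enum rel_op op, int constant)
{
    struct relation r;

    assert(addr != NULL);
    r.addr = addr;
    r.op = op;
    r.constant = constant;
    return r;
}

/* Test the relation against the current contents of the frozen address. */
int relation_holds(const struct relation *r)
{
    int v;

    assert(r != NULL);
    v = *r->addr;
    switch (r->op) {
    case OP_GT: return v > r->constant;
    case OP_LT: return v < r->constant;
    case OP_GE: return v >= r->constant;
    case OP_LE: return v <= r->constant;
    case OP_EQ: return v == r->constant;
    case OP_NE: return v != r->constant;
    }
    return 0;
}
```
.DE
.LP
If the relation is frozen with the address of array[n] while "n" equals 3,
a later change of "n" has no influence:
relation_holds keeps looking at array[3].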
.PP
There are 3 circumstances under which a stop-condition is de-activated
and removed:
.SS
.IP -
The stop-condition is satisfied;
control is given at the first breakpoint where this occurs.
.IP -
The traced program terminates, before the stop-condition has been
satisfied.
.IP -
The stop-condition contains a variable or address which is no longer
accessible.
This can happen if a stop-condition contains, for instance,
a variable local to some routine and a return is executed from that
routine while the stop-condition is not yet satisfied.
An error message is produced and
control is given at the first breakpoint encountered.
.SE
.PP
Exactly the same rule as given for line-conditions applies to
stop-conditions:
the specification of a stop-condition does not terminate the
conversation at the current breakpoint.
A newline character is still required to end the dialogue and
continue execution under the new stop-condition.
.SH
External events
.PP
C-TRACER recognizes several
.ul
external events,
which result in the activation of the next breakpoint encountered.
An external event overrules a line-condition or stop-condition.
C-TRACER recognizes a subset of all possible run-time errors:
.SS
.IP -
interrupt (interrupt key hit).
In this way control may be obtained at an arbitrary moment
during program execution.
If however the terminal is in "raw" input mode (see
.ul
Shortcomings)
interrupts are ignored and the facility just mentioned
is not available.
.IP -
segmentation violation (addressing memory that is not allocated).
.IP -
bus error (odd address).
.SE
.LP
An appropriate message is given before the next breakpoint is
activated.
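.PP
One possible way to obtain this behaviour for the interrupt case is sketched below in C.
It is an assumption about how such a mechanism could look, not a description of C-TRACER internals;
all names are hypothetical, and segmentation violations and bus errors are left out.
The handler only records the event;
the next breakpoint then becomes active regardless of any line-condition or stop-condition.
.DS L
```c
#include <assert.h>
#include <signal.h>

/* Set by the handler; examined and cleared at the next breakpoint. */
static volatile sig_atomic_t event_pending = 0;

static void on_interrupt(int sig)
{
    (void)sig;
    event_pending = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int watch_interrupt(void)
{
    return signal(SIGINT, on_interrupt) == SIG_ERR ? -1 : 0;
}

/* Called at every breakpoint with the outcome (0 or 1) of the
   line-condition and stop-condition tests.
   A pending external event overrules that outcome. */
int breakpoint_active(int conditions_hold)
{
    assert(conditions_hold == 0 || conditions_hold == 1);
    if (event_pending) {
        event_pending = 0;
        return 1;
    }
    return conditions_hold;
}
```
.DE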
.SH
Shortcomings
.PP
There are several restrictions and shortcomings in C-TRACER.
Some can be overcome by adding new features;
but traced programs which are larger and slower will probably be
the result.
Others are inherently difficult to deal with.
Some of these shortcomings are:
.SS
.IP -
C-TRACER uses the same run-time stack as the traced program does.
If the stack-organization is damaged for some reason
(for example, by assigning to an illegal element of a local array;
C is an unsafe language!),
C-TRACER is no longer of much use.
A solution might be to make a separate process of C-TRACER
and look into the traced program by means of the "ptrace"
system-call.
This construction is likely to give much system overhead,
especially when a stop-condition has been activated.
.IP -
C-TRACER reads from the standard input file on a line-by-line basis.
If the traced program reads from the standard input file on a
character-by-character basis, the problem arises of deciding which
input characters must go to which reader.
The user can solve this problem in two (not very elegant) ways:
.RS
.IP a)
Read not from the standard input file, but from another
file opened explicitly.
This solution works very well but contradicts the
C-TRACER philosophy that the source-text of a traced program
must remain unchanged.
.IP b)
Set the terminal in "raw" input mode.
In this mode input characters are sent to the reading process
immediately, in contrast with "cooked" mode when the system buffers
characters internally before sending them to the reading process.
This solution works but is dangerous.
The traced program cannot be killed if it loops infinitely
without giving control at some breakpoint.
The latter can occur if a non-satisfiable stop-condition
or line-condition has been specified.
.RE
.IP -
Object modules of traced programs tend to become large.
This is caused both by the inserted breakpoints and by the fact that
all symbol-tables are kept in core.
One might consider keeping the symbol-tables in a file,
at the price of slowing down program execution.
.IP -
A separate symbol-table is generated for each source-text file
containing routines to be traced.
This has the advantage that it is possible to compile only parts of a
program under the trace option.
As a consequence, however, C-TRACER cannot find routine names
which are defined in another source file.
.IP -
Several debugging facilities are
.ul
not
implemented in C-TRACER.
These include: program editing, changing the flow-of-control
of the traced program (call routine X, goto label Y and so forth) and
probably many others.
.SE
.bp
.DS C
APPENDIX A
.sp
Summary of C-TRACER commands
.DE
.DS
input
	: line_condition
	| stop_condition
	| lvalue
	| lvalue = rvalue
	| lvalue / sub , sub /
	|
	.
.DE
.DS
lvalue
	: * lvalue
	| var
	| lvalue . var
	| lvalue -> var
	.
.DE
.DS
rvalue
	: constant
	| & lvalue
	.
.DE
.DS
var
	: identifier
	| identifier sublist
	.
.DE
.DS
sublist
	: sublist sub
	| sub
	.
.DE
.DS
sub
	: [ rvalue ]
	.
.DE
.DS
constant
	: char_con
	| + integer
	| - integer
	.
.DE
.DS
stop_condition
	: ( relation_list )
	.
.DE
.DS
relation_list
	: ( relation_list )
	| relation_list boolean_op relation
	| relation
	.
.DE
.DS
relation
	: lvalue relational_op rvalue
	.
.DE
.DS
boolean_op
	: ||
	| &&
	.
.DE
.DS
relational_op
	: >
	| <
	| >=
	| <=
	| ==
	| !=
	.
.DE
.DS
line_condition
	: / name_part / tail
	.
.DE
.DS
tail
	: range /
	| range / repeat
	|
	.
.DE
.DS
range
	: rvalue
	| rvalue , rvalue
	| , rvalue
	.
.DE
.DS
name_part
	: identifier
	|
	.
.DE
.SH
Notes:
.IP 1)
The definitions of identifier, integer and character constant are omitted;
they mirror their C equivalents.
.IP 2)
Numbers with a leading zero are interpreted as being octal.
.IP 3)
Only subscripts of type "int" or "char" are allowed.
.IP 4)
Only the first eight alphanumeric characters of an identifier are
significant.
.IP 5)
The externals used by C-TRACER internally all start with "xx".
Conflicts may be expected during linkage editing, if a user program
contains externals of the same form.
.IP 6)
Relational operators are commutative.
.IP 7)
Boolean operators ("||" and "&&") have equal priority and associate
from left to right.
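.PP
Note 7 differs from C itself, where "&&" binds more tightly than "||".
The following C sketch illustrates the left-to-right rule;
the function name and the encoding of the operators as single characters are hypothetical
and serve only this example.
.DS L
```c
#include <assert.h>
#include <stddef.h>

/* Combine n relation results with n-1 operators, strictly left to right.
   ops[i] is '|' for "||" and '&' for "&&"; both have equal priority. */
int eval_left_to_right(const int *vals, const char *ops, size_t n)
{
    size_t i;
    int result;

    assert(vals != NULL && n > 0);
    assert(n == 1 || ops != NULL);
    result = vals[0] != 0;
    for (i = 1; i < n; i++) {
        if (ops[i - 1] == '|')
            result = result || vals[i];
        else
            result = result && vals[i];
    }
    return result;
}
```
.DE
.LP
For three relations with outcomes 1, 0 and 0 joined by "||" and "&&" in that order,
this rule gives (1 || 0) && 0, which is false,
whereas C would read the same text as 1 || (0 && 0), which is true.
Parentheses in the relation_list can be used to force the intended grouping.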
