shithub: purgatorio

ref: a5cb451b299b03f44154fac5780b6a57ca130ce0
dir: /doc/acidpaper.ms/

View raw version
.TL
Acid: A Debugger Built From A Language
.AU
.I "Phil Winterbottom"
.AI
.I "Lucent Technologies Inc"
.AB
.FS
\l'1i'
.br
Originally appeared in
.I
Proc. of the Winter 1994 USENIX Conf.,
.R
pp. 211-222,
San Francisco, CA;
and subsequently in the
.I "Plan 9 Programmer's Manual, Volume 2 (Second Edition)" .
.FE
Acid is an unusual source-level symbolic debugger for Plan 9. It is implemented
as a language interpreter with specialized primitives that provide
debugger support.  Programs written in the language manipulate
one or more target processes; variables in the language represent the
symbols, state, and resources of those processes. 
This structure allows complex
interaction between the debugger and the target program and
provides a convenient method of parameterizing differences between
machine architectures.
Although some effort is required to learn
the debugging language, the richness and flexibility of the
debugging environment encourages new ways of reasoning about the way
programs run and the conditions under which they fail.
.AE
.NH
Introduction
.PP
The size and complexity
of programs have increased in proportion to processor speed and memory but
the interface between debugger and programmer has changed little.
Graphical user interfaces have eased some of the tedious
aspects of the interaction. A graphical interface is a convenient
means for navigating through source and data structures but provides
little benefit for process control.
The introduction of a new concurrent language, Alef [Win93], emphasized the
inadequacies of the existing Plan 9 [Pike90] debugger
.I db ,
a distant relative of
.I adb ,
and made it clear that a new debugger was required.
.PP
Current debuggers like
.I dbx ,
.I sdb ,
and
.I gdb
are limited to answering only the questions their authors
envisage.  As a result, they supply a plethora
of specialized commands, each attempting to anticipate
a specific question a user may ask.
When a debugging situation arises that is beyond the scope
of the command set, the tool is useless.
Further,
it is often tedious or impossible to reproduce an anomalous state
of the program, especially when
the state is embedded in the program's data structures.
.PP
Acid applies some ideas found in CAD software used for
hardware test and simulation.
It is based on the notion that the state and resources of a program
are best represented and manipulated by a language. The state and resources,
such as memory, registers, variables, type information and source code
are represented by variables in the language.
Expressions provide a computation mechanism and control
statements allow repetitive or selective interpretation based
on the result of expression evaluation.
The heart of the Acid debugger is an interpreter for a small typeless
language whose operators mirror the operations
of C and Alef, which in turn correspond well to the basic operations of
the machine. The interpreter itself knows nothing of the underlying
hardware; it deals with the program state and resources
in the abstract.
Fundamental routines to control
processes, read files, and interface to the system are implemented
as builtin functions available to the interpreter.
The actual debugger functionality is coded
in Acid; commands are implemented as Acid functions.
.PP
This language-based approach has several advantages.
Most importantly, programs written in Acid, including most of the
debugger itself, are inherently portable.
Furthermore, Acid avoids the limitations other debuggers impose when
debugging parallel programs.  Instead of embedding a fixed
process model in the debugger, Acid allows the
programmer to adapt the debugger to handle an
arbitrary process partitioning or program structure. 
The ability to
interact dynamically with an executing process provides clear advantages
over debuggers constrained to probe a static image.
Finally, the Acid language is a powerful vehicle for expressing
assertions about logic, process state, and the contents of data structures.
When combined with dynamic interaction it allows a
limited form of automated program verification without requiring
modification or recompilation of the source code.
The language is also an
excellent vehicle for preserving a test suite for later regression testing.
.PP
The debugger may be customized by its users; standard
functions may be modified or extended to suit a particular application
or preference.
For example, the kernel developers in our group require a
command set supporting assembler-level debugging while the application
programmers prefer source-level functionality.
Although the default library is biased toward assembler-level debugging,
it is easily modified to provide a convenient source-level interface.
The debugger itself does not change; the user combines primitives
and existing Acid functions in different ways to
implement the desired interface.
.NH
Related Work
.PP
DUEL [Gol93], an extension to
.I gdb
[Stal91], proposes using a high level expression evaluator to solve
some of these problems. The evaluator provides iterators to loop over data
structures and conditionals to control evaluation of expressions.
The author shows that complex state queries can be formulated
by combining concise expressions but this only addresses part of the problem.
A program is a dynamic entity; questions asked when the program is in
a static state are meaningful only after the program has been `caught' in
that state. The framework for manipulating the program is still as
primitive as the underlying debugger. While DUEL provides a means to
probe data structures it entirely neglects the most beneficial aspect
of debugging languages: the ability to control processes. Acid is structured
around a thread of control that passes between the interpreter and the
target program.
.PP
The NeD debugger [May92] is a set of extensions to TCL [Ous90] that provide
debugging primitives. The resulting language, NeDtcl, is used to implement
a portable interface between a conventional debugger, pdb [May90], and
a server that executes NeDtcl programs operating on the target program.
Execution of the NeDtcl programs implements the debugging primitives
that pdb expects.
NeD is targeted at multi-process debugging across a network,
and proves the flexibility of a language as a means of
communication between debugging tools. Whereas NeD provides an interface
between a conventional debugger and the process it debugs, Acid is the
debugger itself. While NeD has some of the ideas
found in Acid it is targeted toward a different purpose. Acid seeks to
integrate the manipulation of a program's resources into the debugger
while NeD provides a flexible interconnect between components of
the debugging environment. The choice of TCL is appropriate for its use
in NeD but is not suitable for Acid. Acid relies on the coupling of the type
system with expression evaluation, which are the root of its design,
to provide the debugging primitives.
.PP
Dalek [Ols90] is an event based language extension to gdb. State transitions
in the target program cause events to be queued for processing by the
debugging language.
.PP
Acid has many of the advantages of same process or
.I local
.I agent
debuggers, like Parasight [Aral], without the need for dynamic linking or
shared memory.
Acid improves on the ideas of these other systems by completely integrating
all aspects of the debugging process into the language environment. Of
particular importance is the relationship between Acid variables,
program symbols, source code, registers and type information. This
integration is made possible by the design of the Acid language.
.PP
Interpreted languages such as Lisp and Smalltalk are able to provide
richer debugging environments through more complete information than
their compiled counterparts. Acid is a means to gather and represent
similar information about compiled programs through cooperation
with the compilation tools and library implementers.
.NH
Acid the Language
.PP
Acid is a small interpreted language targeted to its debugging task.
It focuses on representing program state and addressing data rather than
expressing complex computations. Program state is
.I addressable
from an Acid program.
In addition to parsing and executing expressions and providing
an architecture-independent interface to the target process,
the interpreter supplies a mark-and-scan garbage collector
to manage storage.
.PP
Every Acid session begins with the loading of the Acid libraries.
These libraries contain functions, written in Acid, that provide
a standard debugging environment including breakpoint management,
stepping by instruction or statement, stack tracing, and
access to variables, memory, and registers.
The library contains 600 lines of Acid code and provides
functionality similar to
.I dbx .
Following the loading of the system library, Acid loads
user-specified libraries; this load sequence allows the
user to augment or override the standard commands
to customize the debugging environment.  When all libraries
are loaded, Acid issues an interactive prompt and begins
evaluating expressions entered by the user.  The Acid `commands'
are actually invocations of builtin primitives or previously defined
Acid functions. Acid evaluates each expression as it is entered and
prints the result.
.NH
Types and Variables
.PP
Acid variables are of four basic types:
.I integer ,
.I string ,
.I float ,
and
.I list .
The type of a variable is inferred by the type of the right-hand side of
an assignment expression.
Many of the operators can be applied to more than
one type; for these operators the action of the operator is determined
by the type of its operands.
For example,
the
.CW +
operator adds
.I integer
and
.I float
operands, and concatenates
.I string
and
.I list
operands.
Lists are the only complex type in Acid; there are no arrays, structures
or pointers. Operators provide
.CW head ,
.CW tail ,
.CW append
and
.CW delete
operations.
Lists can also be indexed like arrays.
.PP
Acid has two levels of scope: global and local.
Function parameters and variables declared in a function body
using the
.CW local
keyword are created at entry to the function and
exist for the lifetime of a function.
Global variables are created by assignment and need not be declared.
All variables and functions in the program
being debugged are entered in the Acid symbol table as global
variables during Acid initialization.
Conflicting variable names are resolved by prefixing enough `$' characters
to make them unique.
Syntactically, Acid variables and target program
symbols are referenced identically.
However, the variables are managed differently in the Acid
symbol table and the user must be aware of this distinction.
The value of an Acid variable is stored in the symbol
table; a reference returns the value.
The symbol table entry for a variable or function in the target
program contains the address of that symbol in the image
of the program.  Thus, the value of a program variable is
accessed by indirect reference through the Acid
variable that has the same name; the value of an Acid variable is the
address of the corresponding program variable.
.NH
Control Flow
.PP
The
.CW while
and
.CW loop
statements implement looping.
The former
is similar to the same statement in C.
The latter evaluates starting and ending expressions yielding
integers and iterates while an incrementing loop index
is within the bounds of those expressions.
.P1
acid: i = 0; loop 1,5 do print(i=i+1)
0x00000001
0x00000002
0x00000003
0x00000004
0x00000005
acid:
.P2
The traditional
.CW if-then-else 
statement implements conditional execution.
.NH
Addressing
.PP
Two indirection operators allow Acid to access values in
the program being debugged.
The
.CW *
operator fetches a value from the memory image of an
executing process;
the
.CW @
operator fetches a value from the text file of the process.
When either operator appears on the left side of an assignment, the value
is written rather than read.
.PP
The indirection operator must know the size of the object
referenced by a variable.
The Plan 9 compilers neglect to include this
information in the program symbol table, so Acid cannot
derive this information implicitly.
Instead Acid variables have formats.
The format is a code
letter specifying the printing style and the effect of some of the
operators on that variable.
The indirection operators look at the format code to determine the
number of bytes to read or write.
The format codes are derived from the format letters used by
.I db .
By default, symbol table variables and numeric constants
are assigned the format code
.CW 'X'
which specifies 32-bit hexadecimal.
Printing such a variable yields output of the form
.CW 0x00123456 .
An indirect reference through the variable fetches 32 bits
of data at the address indicated by the variable.
Other formats specify various data types, for example
.CW i
an instruction,
.CW D
a signed 32 bit decimal,
.CW s
a null-terminated string.
The
.CW fmt
function
allows the user to change the format code of a variable
to control the printing format and
operator side effects.
This function evaluates the expression supplied as the first
argument, attaches the format code supplied as the second
argument to the result and returns that value.
If the result is assigned to a variable,
the new format code applies to
that variable.  For convenience, Acid provides the
.CW \e
operator as a shorthand infix form of
.CW fmt .
For example:
.P1
acid: x=10
acid: x				 // print x in hex
0x0000000a 
acid: x = fmt(x, 'D')		 // make x type decimal
acid: print(x, fmt(x, 'X'), x\eX) // print x in decimal & hex
10 0x0000000a 0x0000000a
acid: x				 // print x in decimal
10
acid: x\eo			 // print x in octal
000000000012
.P2
The 
.CW ++
and
.CW --
operators increment or decrement a variable by an amount
determined by its format code.  Some formats imply a non-fixed size.
For example, the
.CW i
format code disassembles an instruction into a string.
On a 68020, which has variable length instructions:
.P1
acid: p=main\ei                     // p=addr(main), type INST
acid: loop 1,5 do print(p\eX, @p++) // disassemble 5 instr's
0x0000222e LEA	0xffffe948(A7),A7
0x00002232 MOVL	s+0x4(A7),A2
0x00002236 PEA	0x2f($0)
0x0000223a MOVL	A2,-(A7)
0x0000223c BSR	utfrrune
acid:
.P2
Here,
.CW main
is the address of the function of the same name in the program under test.
The loop retrieves the five instructions beginning at that address and
then prints the address and the assembly language representation of each.
Notice that the stride of the increment operator varies with the size of
the instruction: the
.CW MOVL
at 
.CW 0x0000223a
is a two byte instruction while all others are four bytes long.
.PP
Registers are treated as normal program variables referenced
by their symbolic assembler language names.
When a
process stops, the register set is saved by the kernel
at a known virtual address in the process memory map.
The Acid variables associated with the registers point
to the saved values and the
.CW *
indirection operator can then be used to read and write the register set.
Since the registers are accessed via Acid variables they may
be used in arbitrary expressions.
.P1
acid: PC                            // addr of saved PC
0xc0000f60 
acid: *PC
0x0000623c                          // contents of PC
acid: *PC\ea
main
acid: *R1=10                        // modify R1
acid: asm(*PC+4)                    // disassemble @ PC+4
main+0x4 0x00006240 	MOVW	R31,0x0(R29)
main+0x8 0x00006244 	MOVW	$setR30(SB),R30
main+0x10 0x0000624c 	MOVW	R1,_clock(SB)
.P2
Here, the saved
.CW PC
is stored at address
.CW 0xc0000f60 ;
its current content is
.CW 0x0000623c .
The
.CW a ' `
format code converts this value to a string specifying
the address as an offset beyond the nearest symbol.
After setting the value of register
.CW 1 ,
the example uses the
.CW asm
command to disassemble a short section of code beginning
at four bytes beyond the current value of the
.CW PC .
.NH
Process Interface
.PP
A program executing under Acid is monitored through the
.I proc
file system interface provided by Plan 9.
Textual messages written to the
.CW ctl
file control the execution of the process.
For example writing
.CW waitstop
to the control file causes the write to block until the target
process enters the kernel and is stopped. When the process is stopped
the write completes. The
.CW startstop
message starts the target process and then does a
.CW waitstop
action.
Synchronization between the debugger and the target process is determined
by the actions of the various messages. Some operate asynchronously to the
target process and always complete immediately, others block until the
action completes. The asynchronous messages allow Acid to control
several processes simultaneously.
.PP
The interpreter has builtin functions named after each of the control
messages. The functions take a process id as argument.
Any time a control message causes the program to execute instructions 
the interpreter performs two actions when the control operation has completed.
The Acid variables pointing at the register set are fixed up to point
at the saved registers, and then
the user defined function
.CW stopped
is executed.
The 
.CW stopped
function may print the current address,
line of source or instruction and return to interactive mode. Alternatively
it may traverse a complex data structure, gather statistics and then set
the program running again.
.PP
Several Acid variables are maintained by the debugger rather than the
programmer.
These variables allow generic Acid code to deal with the current process,
architecture specifics or the symbol table.
The variable
.CW pid
is the process id of the current process Acid is debugging.
The variable
.CW symbols
contains a list of lists where each sublist contains the symbol
name, its type and the value of the symbol.
The variable
.CW registers
contains a list of the machine-specific register names. Global symbols in the target program
can be referenced directly by name from Acid. Local variables
are referenced using the colon operator as \f(CWfunction:variable\fP.
.NH
Source Level Debugging
.PP
Acid provides several builtin functions to manipulate source code.
The
.CW file
function reads a text file, inserting each line into a list.
The
.CW pcfile
and
.CW pcline
functions each take an address as an argument.
The first
returns a string containing the name of the source file
and the second returns an integer containing the line number
of the source line containing the instruction at the address.
.P1
acid: pcfile(main)		// file containing main
main.c
acid: pcline(main)		// line # of main in source
11
acid: file(pcfile(main))[pcline(main)]	// print that line
main(int argc, char *argv[])
acid: src(*PC)			// print statements nearby
 9
 10 void
>11 main(int argc, char *argv[])
 12 {
 13	int a;
.P2
In this example, the three primitives are combined in an expression to print
a line of source code associated with an address.
The
.CW src
function prints a few lines of source
around the address supplied as its argument. A companion routine,
.CW Bsrc ,
communicates with the external editor
.CW sam .
Given an address, it loads the corresponding source file into the editor
and highlights the line containing the address.  This simple interface
is easily extended to more complex functions.
For example, the
.CW step
function can select the current file and line in the editor
each time the target program stops, giving the user a visual
trace of the execution path of the program. A more complete interface
allowing two way communication between Acid and the
.CW acme
user interface [Pike93] is under construction. A filter between the debugger
and the user interface provides interpretation of results from both
sides of the interface. This allows the programming environment to
interact with the debugger and vice-versa, a capability missing from the
.CW sam
interface.
The
.CW src
and
.CW Bsrc
functions are both written in Acid code using the file and line primitives.
Acid provides library functions to step through source level
statements and functions. Furthermore, addresses in Acid expressions can be
specified by source file and line.
Source code is manipulated in the Acid
.I list
data type.
.NH
The Acid Library
.PP
The following examples define some useful commands and
illustrate the interaction of the debugger and the interpreter.
.P1
defn bpset(addr)                          // set breakpoint
{
	if match(addr, bplist) >= 0 then
		print("bkpoint already set:", addr\ea, "\en");
	else {
		*fmt(addr, bpfmt) = bpinst;   // plant it
		bplist = append bplist, addr; // add to list
	}
}
.P2
The
.CW bpset
function plants a break point in memory. The function starts by
using the
.CW match
builtin to
search the breakpoint list to determine if a breakpoint is already
set at the address.
The indirection operator, controlled by the format code returned
by the
.CW fmt
primitive, is used to plant the breakpoint in memory.
The variables
.CW bpfmt
and
.CW bpinst
are Acid global variables containing the format code specifying
the size of the breakpoint instruction and the breakpoint instruction
itself.
These
variables are set by architecture-dependent library code
when the debugger first attaches to the executing image.
Finally the address of the breakpoint is
appended to the breakpoint list,
.CW bplist .
.P1
defn step()				// single step
{
	local lst, lpl, addr, bput;

	bput = 0;			// sitting on bkpoint
	if match(*PC, bplist) >= 0 then {	
		bput = fmt(*PC, bpfmt);	// save current addr
		*bput = @bput;		// replace it
	}

	lst = follow(*PC);		// get follow set

	lpl = lst;
	while lpl do {			// place breakpoints
		*(head lpl) = bpinst;
		lpl = tail lpl;
	}

	startstop(pid);			// do the step

	while lst do {			// remove breakpoints
		addr = fmt(head lst, bpfmt);
		*addr = @addr;		// replace instr.
		lst = tail lst;
	}
	if bput != 0 then
		*bput = bpinst;		// restore breakpoint
}
.P2
The
.CW step
function executes a single assembler instruction.
If the
.CW PC
is sitting
on a breakpoint, the address and size of
the breakpoint are saved.
The breakpoint instruction
is then removed using the
.CW @
operator to fetch
.CW bpfmt
bytes from the text file and to place it into the memory
of the executing process using the
.CW *
operator.
The
.CW follow
function is an Acid
builtin which returns a follow-set: a list of instruction addresses which
could be executed next.
If the instruction stored at the
.CW PC
is a branch instruction, the
list contains the addresses of the next instruction and
the branch destination; otherwise, it contains only the
address of the next instruction.
The follow-set is then used to replace each possible following
instruction with a breakpoint instruction.  The original
instructions need not be saved; they remain
in their unaltered state in the text file.
The
.CW startstop
builtin writes the `startstop' message to the
.I proc
control file for the process named
.CW pid .
The target process executes until some condition causes it to
enter the kernel, in this case, the execution of a breakpoint.
When the process blocks, the debugger regains control and invokes the
Acid library function
.CW stopped
which reports the address and cause of the blockage.
The
.CW startstop
function completes and returns to the
.CW step
function where
the follow-set is used to replace the breakpoints placed earlier.
Finally, if the address of the original
.CW PC
contained a breakpoint, it is replaced.
.PP
Notice that this approach to process control is inherently portable;
the Acid code is shared by the debuggers for all architectures.
Acid variables and builtin functions provide a transparent interface
to architecture-dependent values and functions.  Here the breakpoint
value and format are referenced through Acid variables and the
.CW follow
primitive masks the differences in the underlying instruction set.
.PP
The
.CW next
function, similar to the
.I dbx
command of the same name,
is a simpler example.
This function steps through
a single source statement but steps over function calls.
.P1
defn next()
{
	local sp, bound;

	sp = *SP;			// save starting SP
	bound = fnbound(*PC);		// begin & end of fn.
	stmnt();			// step 1 statement
	pc = *PC;
	if pc >= bound[0] && pc < bound[1] then
		return {};

	while (pc<bound[0] || pc>bound[1]) && sp>=*SP do {
		step();
		pc = *PC;
	}
	src(*PC);
}
.P2
The
.CW next
function
starts by saving the current stack pointer in a local variable.
It then uses the Acid library function
.CW fnbound
to return the addresses of the first and last instructions in
the current function in a list.
The
.CW stmnt
function executes a single source statement and then uses
.CW src
to print a few lines of source around the new
.CW PC .
If the new value of the
.CW PC
remains in the current function,
.CW next
returns.
When the executed statement is a function call or a return
from a function, the new value of the
.CW PC
is outside the bounds calculated by
.CW fnbound 
and the test of the
.CW while
loop is evaluated.
If the statement was a return, the new value of the stack pointer
is greater than the original value and the loop completes without
execution.
Otherwise, the loop is entered and instructions are continually
executed until the value of the
.CW PC
is between the bounds calculated earlier.  At that point, execution
ceases and a few lines of source in the vicinity of the
.CW PC
are printed.
.PP
Acid provides concise and elegant expression for control and
manipulation of target programs. These examples demonstrate how a
few well-chosen primitives can be combined to create a rich debugging environment.
.NH
Dealing With Multiple Architectures
.PP
A single binary of Acid may be used to debug a program running on any
of the five processor architectures supported by Plan 9.  For example,
Plan 9 allows a user on a MIPS to import the
.I proc
file system from an i486-based PC and remotely debug a program executing
on that processor.
.PP
Two levels of abstraction provide this architecture independence.
On the lowest level, a Plan 9 library supplies functions to
decode the file header of the program being debugged and
select a table of system parameters
and a jump vector of architecture-dependent
functions based on the magic number.
Among these functions are byte-order-independent
access to memory and text files, stack manipulation, disassembly,
and floating point number interpretation.
The second level of abstraction is supplied by Acid.
It consists of primitives and approximately 200 lines
of architecture-dependent Acid library code that interface the
interpreter to the architecture-dependent library.
This layer performs functions such as mapping register names to
memory locations, supplying breakpoint values and sizes,
and converting processor specific data to Acid data types.
An example of the latter is the stack trace function
.CW strace ,
which uses the stack traversal functions in the
architecture-dependent library to construct a list of lists describing
the context of a process.  The first level of list selects
each function in the trace; subordinate lists contain the
names and values of parameters and local variables of
the functions.  Acid commands and library functions that
manipulate and display process state information operate
on the list representation and are independent of the
underlying architecture.
.NH
Alef Runtime
.PP
Alef is a concurrent programming language,
designed specifically for systems programming, which supports both
shared variable and message passing paradigms.
Alef borrows the C expression syntax but implements
a substantially different type system.
The language provides a rich set of 
exception handling, process management, and synchronization
primitives, which rely on a runtime system.
Alef program bugs are often deadlocks, synchronization failures,
or non-termination caused by locks being held incorrectly.
In such cases, a process stalls deep
in the runtime code and it is clearly
unreasonable to expect a programmer using the language
to understand the detailed
internal semantics of the runtime support functions.
.PP
Instead, there is an Alef support library, coded in Acid, that
allows the programmer to interpret the program state in terms of
Alef operations.  Consider the example of a multi-process program
stalling because of improper synchronization.  A stack trace of
the program indicates that it is waiting for an event in some
obscure Alef runtime
synchronization function.
The function itself is irrelevant to the
programmer; of greater importance is the identity of the
unfulfilled event.
Commands in the Alef support library decode
the runtime data structures and program state to report the cause
of the blockage in terms of the high-level operations available to
the Alef programmer.  
Here, the Acid language acts
as a communications medium between Alef implementer and Alef user.
.NH
Parallel Debugging
.PP
The central issue in parallel debugging is how the debugger is
multiplexed between the processes comprising
the program.
Acid has no intrinsic model of process partitioning; it
only assumes that parallel programs share a symbol table,
though they need not share memory.
The
.CW setproc
primitive attaches the debugger to a running process
associated with the process ID supplied as its argument
and assigns that value to the global variable
.CW pid ,
thereby allowing simple rotation among a group of processes.
Further, the stack trace primitive is driven by parameters
specifying a unique process context, so it is possible to
examine the state of cooperating processes without switching
the debugger focus from the process of interest.
Since Acid is inherently extensible and capable of
dynamic interaction with subordinate processes, the
programmer can define Acid commands to detect and control
complex interactions between processes.
In short, the programmer is free to specify how the debugger reacts
to events generated in specific threads of the program.
.PP
The support for parallel debugging in Acid depends on a crucial kernel
modification: when the text segment of a program is written (usually to
place a breakpoint), the segment is cloned to prevent other threads
from encountering the breakpoint.  Although this incurs a slight performance
penalty, it is of little importance while debugging.
.NH
Communication Between Tools
.PP
The Plan 9 Alef and C compilers do not
embed detailed type information in the symbol table of an
executable file.
However, they do accept a command line option causing them to
emit descriptions of complex data types
(e.g., aggregates and abstract data types)
to an auxiliary file.
The vehicle for expressing this information is Acid source code.
When an Acid debugging session is 
subsequently started, that file is loaded with the other Acid libraries.
.PP
For each complex object in the program the compiler generates
three pieces of Acid code.
The first is a table describing the size and offset of each
member of the complex data type.  Following is an Acid function,
named the same as the object, that formats and prints each member.
Finally, Acid declarations associate the
Alef or C program variables of a type with the functions
to print them.
The three forms of declaration are shown in the following example:
.P1
struct Bitmap {
	Rectangle    0 r;
	Rectangle   16 clipr;
	'D'   32 ldepth;
	'D'   36 id;
	'X'   40 cache;
};
.P2
.P1
defn
Bitmap(addr) {
	complex Bitmap addr;
	print("Rectangle r {\en");
	Rectangle(addr.r);
	print("}\en");
	print("Rectangle clipr {\en");
	Rectangle(addr.clipr);
	print("}\en");
	print("	ldepth	", addr.ldepth, "\en");
	print("	id	", addr.id, "\en");
	print("	cache	", addr.cache, "\en");
};

complex Bitmap darkgrey;
complex Bitmap Window_settag:b;
.P2
The
.CW struct
declaration specifies decoding instructions for the complex type named
.CW Bitmap .
Although the syntax is superficially similar to a C structure declaration,
the semantics differ markedly: the C declaration specifies a layout, while
the Acid declaration tells how to decode it.
The declaration specifies a type, an offset, and name for each
member of the complex object. The type is either the name of another
complex declaration, for example,
.CW Rectangle ,
or a format code.
The offset is the number of bytes from the start
of the object to the member
and the name is the member's name in the Alef or C declaration.
This type description is a close match for C and Alef, but is simple enough
to be language independent.
.PP
The
.CW Bitmap
function expects the address of a
.CW Bitmap
as its only argument.
It uses the decoding information contained in the
.CW Bitmap
structure declaration to extract, format, and print the
value of each member of the complex object pointed to by
the argument.
The Alef compiler emits code to call other Acid functions
where a member is another complex type; here,
.CW Bitmap
calls
.CW Rectangle
to print its contents.
.PP
The
.CW complex
declarations associate Alef variables with complex types.
In the example,
.CW darkgrey
is the name of a global variable of type
.CW Bitmap
in the program being debugged.
Whenever the name
.CW darkgrey
is evaluated by Acid, it automatically calls the
.CW Bitmap
function with the address of
.CW darkgrey
as the argument.
The second
.CW complex
declaration associates a local variable or parameter named
.CW b
in function
.CW Window_settag
with the
.CW Bitmap
complex data type.
.PP
Acid borrows the C operators
.CW .
and
.CW ->
to access the decoding parameters of a member of a complex type.
Although this representation is sufficiently general for describing
the decoding of both C and Alef complex data types, it may
prove too restrictive for target languages with more complicated
type systems.
Further, the assumption that the compiler can select the proper
Acid format code for each basic type in the language is somewhat
naive.  For example, when a member of a complex type is a pointer,
it is assigned a hexadecimal type code; integer members are always 
assigned a decimal type code.
This heuristic proves inaccurate when an integer field is a
bit mask or set of bit flags which are more appropriately displayed
in hexadecimal or octal.
.NH
Code Verification
.PP
Acid's ability to interact dynamically with
an executing program allows passive test and
verification of the target program.  For example,
a common concern is leak detection in programs using
.CW malloc .
Of interest are two items: finding memory that was allocated
but never freed and detecting bad pointers passed to
.CW free .
An auxiliary Acid library contains Acid functions to
monitor the execution of a program and detect these
faults, either as they happen or in the automated
post-mortem analysis of the memory arena.
In the following example, the
.CW sort
command is run under the control of the
Acid memory leak library.
.P1
helix% acid -l malloc /bin/sort
/bin/sort: mips plan 9 executable
/lib/acid/port
/lib/acid/mips
/lib/acid/malloc
acid: go()
now
is
the
time
<ctrl-d>
is
now
the
time
27680 : breakpoint	_exits+0x4	MOVW	$0x8,R1
acid: 
.P2
The
.CW go
command creates a process and plants
breakpoints at the entry to
.CW malloc
and
.CW free .
The program is then started and continues until it
exits or stops.  If the reason for stopping is anything
other than the breakpoints in
.CW malloc
and
.CW free ,
Acid prints the usual status information and returns to the
interactive prompt.
.PP
When the process stops on entering
.CW malloc ,
the debugger must capture and save the address that
.CW malloc
will return.
After saving a stack
trace so the calling routine can be identified, it places
a breakpoint at the return address and restarts the program.
When
.CW malloc
returns, the breakpoint stops the program,
allowing the debugger
to grab the address of the new memory block from the return register.
The address and stack trace are added to the list of outstanding
memory blocks, the breakpoint is removed from the return point, and
the process is restarted.
.PP
When the process stops at the beginning of
.CW free ,
the memory address supplied as the argument is compared to the list
of outstanding memory blocks.  If it is not found an error message
and a stack trace of the call is reported; otherwise, the
address is deleted from the list.
.PP
When the program exits, the list of outstanding memory blocks contains
the addresses of all blocks that were allocated but never freed.
The
.CW leak
library function traverses the list producing a report describing
the allocated blocks.
.P1 1m
acid: leak()
Lost a total of 524288 bytes from:
    malloc() malloc.c:32 called from dofile+0xe8 sort.c:217 
    dofile() sort.c:190 called from main+0xac sort.c:161 
    main() sort.c:128 called from _main+0x20 main9.s:10 
Lost a total of 64 bytes from:
    malloc() malloc.c:32 called from newline+0xfc sort.c:280 
    newline() sort.c:248 called from dofile+0x110 sort.c:222 
    dofile() sort.c:190 called from main+0xac sort.c:161 
    main() sort.c:128 called from _main+0x20 main9.s:10 
Lost a total of 64 bytes from:
    malloc() malloc.c:32 called from realloc+0x14 malloc.c:129 
    realloc() malloc.c:123 called from bldkey+0x358 sort.c:1388 
    buildkey() sort.c:1345 called from newline+0x150 sort.c:285 
    newline() sort.c:248 called from dofile+0x110 sort.c:222 
    dofile() sort.c:190 called from main+0xac sort.c:161 
    main() sort.c:128 called from _main+0x20 main9.s:10
acid: refs()
data...bss...stack...
acid: leak()
acid: 
.P2
The presence of a block in the allocation list does not imply
it is there because of a leak; for instance, it may have been
in use when the program terminated.
The
.CW refs()
library function scans the
.I data ,
.I bss ,
and
.I stack
segments of the process looking for pointers
into the allocated blocks.  When one is found, the block is deleted from
the outstanding block list.
The
.CW leak
function is used again to report the
blocks remaining allocated and unreferenced.
This strategy proves effective in detecting
disconnected (but non-circular) data structures.
.PP
The leak detection process is entirely passive.
The program is not
specially compiled and the source code is not required.
As with the Acid support functions for the Alef runtime environment,
the author of the library routines has encapsulated the
functionality of the library interface
in Acid code.
Any programmer may then check a program's use of the
library routines without knowledge of either implementation.
The performance impact of running leak detection is great
(about 10 times slower),
but it has not prevented interactive programs like
.CW sam
and the
.CW 8½
window system from being tested.
.NH
Code Coverage
.PP
Another common component of software test uses 
.I coverage 
analysis.
The purpose of the test is to determine which paths through the code have
not been executed while running the test suite.
This is usually
performed by a combination of compiler support and a reporting tool run
on the output generated by statements compiled into the program.
The compiler emits code that
logs the progress of the program as it executes basic blocks and writes the
results to a file. The file is then processed by the reporting tool 
to determine which basic blocks have not been executed.
.PP
Acid can perform the same function in a language independent manner without
modifying the source, object or binary of the program. The following example
shows
.CW ls
being run under the control of the Acid coverage library.
.P1
philw-helix% acid -l coverage /bin/ls
/bin/ls: mips plan 9 executable
/lib/acid/port
/lib/acid/mips
/lib/acid/coverage
acid: coverage()
acid
newstime
profile
tel
wintool
2: (error) msg: pid=11419 startstop: process exited
acid: analyse(ls)
ls.c:102,105
	102:     return 1;
	103: }
	104: if(db[0].qid.path&CHDIR && dflag==0){
	105:     output();
ls.c:122,126
	122:     memmove(dirbuf+ndir, db, sizeof(Dir));
	123:     dirbuf[ndir].prefix = 0;
	124:     p = utfrrune(s, '/');
	125:     if(p){
	126:         dirbuf[ndir].prefix = s;
.P2
The
.CW coverage
function begins by looping through the text segment placing
breakpoints at the entry to each basic block. The start of each basic
block is found using the Acid builtin function
.CW follow .
If the list generated by
.CW follow 
contains more than one
element, then the addresses mark the start of basic blocks. A breakpoint
is placed at each address to detect entry into the block. If the result
of
.CW follow
is a single address then no action is taken, and the next address is
considered. Acid maintains a list of
breakpoints already in place and avoids placing duplicates (an address may be
the destination of several branches).
.PP
After placing the breakpoints the program is set running.
Each time a breakpoint is encountered
Acid deletes the address from the breakpoint list, removes the breakpoint
from memory and then restarts the program.
At any instant the breakpoint list contains the addresses of basic blocks
which have not been executed. 
The
.CW analyse
function reports the lines of source code bounded by basic blocks
whose addresses are have not been deleted from the breakpoint list.
These are the basic blocks which have not been executed.
Program performance is almost unaffected since each breakpoint is executed
only once and then removed.
.PP
The library contains a total of 128 lines of Acid code.
An obvious extension of this algorithm could be used to provide basic block
profiling.
.NH
Conclusion
.PP
Acid has two areas of weakness. As with
other language-based tools like
.I awk ,
a programmer must learn yet another language to step beyond the normal
debugging functions and use the full power of the debugger.
Second, the command line interface supplied by the
.I yacc
parser is inordinately clumsy.
Part of the problem relates directly to the use of
.I yacc
and could be circumvented with a custom parser.
However, structural problems would remain: Acid often requires
too much typing to execute a simple
command.
A debugger should prostitute itself to its users, doing whatever
is wanted with a minimum of encouragement; commands should be
concise and obvious. The language interface is more consistent than
an ad hoc command interface but is clumsy to use.
Most of these problems are addressed by an Acme interface
which is under construction. This should provide the best of
both worlds: graphical debugging and access to the underlying acid
language when required.
.PP
The name space clash between Acid variables, keywords, program variables,
and functions is unavoidable.
Although it rarely affects a debugging session, it is annoying
when it happens and is sometimes difficult to circumvent.
The current renaming scheme
is too crude; the new names are too hard to remember.
.PP
Acid has proved to be a powerful tool whose applications
have exceeded expectations.
Of its strengths, portability, extensibility and parallel debugging support
were by design and provide the expected utility.
In retrospect,
its use as a tool for code test and verification and as
a medium for communicating type information and encapsulating
interfaces has provided unanticipated benefits and altered our
view of the debugging process.
.NH
Acknowledgments
.PP
Bob Flandrena was the first user and helped prepare the paper.
Rob Pike endured three buggy Alef compilers and a new debugger
in a single sitting.
.NH
References
.LP
[Pike90] R. Pike, D. Presotto, K. Thompson, H. Trickey,
``Plan 9 from Bell Labs'',
.I
UKUUG Proc. of the Summer 1990 Conf.,
.R
London, England,
1990.
.LP
[Gol93] M. Golan, D. Hanson,
``DUEL -- A Very High-Level Debugging Language'',
.I
USENIX Proc. of the Winter 1993 Conf.,
.R
San Diego, CA,
1993.
.LP
[Lin90] M. A. Linton,
``The Evolution of DBX'',
.I
USENIX Proc. of the Summer 1990 Conf.,
.R
Anaheim, CA,
1990.
.LP
[Stal91] R. M. Stallman, R. H. Pesch,
``Using GDB: A guide to the GNU source level debugger'',
Technical Report, Free Software Foundation,
Cambridge, MA,
1991.
.LP
[Win93] P. Winterbottom,
``Alef reference Manual'',
reprinted in this volume.
.LP
[Pike93] Rob Pike,
``Acme: A User Interface for Programmers'',
.I
USENIX Proc. of the Winter 1994 Conf.,
.R
San Francisco, CA,
reprinted in this volume.
.LP
[Ols90] Ronald A. Olsson, Richard H. Crawford, and W. Wilson Ho,
``Dalek: A GNU, improved programmable debugger'',
.I
USENIX Proc. of the Summer 1990 Conf.,
.R
Anaheim, CA.
.LP
[May92] Paul Maybee,
``NeD: The Network Extensible Debugger''
.I
USENIX Proc. of the Summer 1992 Conf.,
.R
San Antonio, TX.
.LP
[Aral] Ziya Aral, Ilya Gertner, and Greg Schaffer,
``Efficient debugging primitives for multiprocessors'',
.I
Proceedings of the Third International Conference on Architectural
Support for Programming Languages and Operating Systems,
.R
SIGPLAN notices Nr. 22, May 1989.