Archive for June, 2005

YATT Yet Another Truss Testimonial

Wednesday, June 8th, 2005

Truss has got to be the single most useful system utility.

It’s too bad that the command line options are so hard to figure out. The options themselves aren’t that bad, (although there are a lot) it’s just that the man page needs to be rewritten from the top down to account for the newer functionality, and describe it in a comprehensive way. (ha ha, “comprehensive” and “man page”. A contradiction in terms)

The compiler is a terrible thing to debug. (Ours is anyway.) There is a driver front end that parses the command line options, and then runs a sequence of different compiler phases. I maintain a library that’s called in different ways from each of of these phases. So in order to understand how my library is being called “from the compiler” I have to inspect what’s happening in each different phase.

Like some other complicated systems, there is a logging system that can record almost all of my library’s API usage in an automatic way. Usually this suffices for figuring out what’s going on. But today it wasn’t quite enough. The trace log wasn’t telling me which phase was calling which routines. Usually you can guess, but I didn’t want to guess.

Since truss can trace user level calls in shared libraries, I figured I’d give it a whirl. My library is called “lib_I_dbg_gen.so.1” and the function I wanted to watch was called “dbg_set_state”.

I use the -f option to truss because I have to trace through all the children of the compiler driver. I trace the fork/exec system calls so I can see what the child process is. And I tell truus to watch for PLT calls look to the function of interest. (This only works with shared library interfaces, but these days that a very wide net)

% truss -f -texec,fork -ulib_I_dbg_gen:dbg_set_state   cc -xO2 -c t.c

10429:	execve("/set/mercury/dist/fcs.2/sparc-S2/opt/SUNWspro/.../cc",
10429/1:	fork1()						= 10430
10430/1:	fork1()		(returning as child ...)	= 10429
10430/1:	execve("/set/mercury/dist/fcs.2/sparc-S2/opt/SUNWsp
10429/1:	fork1()						= 10432
10432/1:	fork1()		(returning as child ...)	= 10429
10432/1:	execve("/set/mercury/dist/fcs.2/sparc-S2/opt/SUNWspro/
10432/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x0, 0x0, 0x0, 0x0)
10432/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0xff1eae90
10432/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x4ee000, 0x0, 0x0, 0x4ee0a0)
10432/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0
10432/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x0, 0x15f, 0x0, 0xff1e93e8)
10432/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0x4ee000
10429/1:	fork1()						= 10434
10434/1:	fork1()		(returning as child ...)	= 10429
10434/1:	execve("/set/mercury/dist/fcs.2/sparc-S2/opt/.../cg
10434/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x0, 0x0, 0x0, 0x872564)
10434/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0xff1eae90
10434/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x872000, 0xff1e93e8, 
10434/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0

Presto! I can see exactly who’s calling my function. (I truncated the long pathnames for this message)

Of course, it would be nice to see each line labelled with the name of the process that causes that trace line, along with its parents, so I have a script stashed away for that purpose:

% truss .... 2>&1 | truss_anno

          cc, 16432:	execve("/set/vulcan/dist/build14.0/...cc", ...
          cc, 16432/1:	fork1()						= 16433
       cc/??, 16433/1:	fork1()		(returning as child ...)	= 16432
    cc/acomp, 16433/1:	execve("/set/vulcan/dist/build14.0/...acomp", ...
          cc, 16432/1:	fork1()						= 16435
       cc/??, 16435/1:	fork1()		(returning as child ...)	= 16432
    cc/iropt, 16435/1:	execve("/set/vulcan/dist/build14.0/...iropt", ...
    cc/iropt, 16435/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x0, 0x0, 0x0, 0x0)
    cc/iropt, 16435/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0xff1eb5a0
    cc/iropt, 16435/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x4e6000, 0x0, 0x0, 0x4e60a0)
    cc/iropt, 16435/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0
    cc/iropt, 16435/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x0, 0x15f, 0x0, 0xff1e9b00)
    cc/iropt, 16435/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0x4e6000
          cc, 16432/1:	fork1()						= 16437
       cc/??, 16437/1:	fork1()		(returning as child ...)	= 16432
       cc/cg, 16437/1:	execve("/set/vulcan/dist/build14.0/...cg", ...
       cc/cg, 16437/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x0, 0x0, 0x0, 0x8724b0)
       cc/cg, 16437/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0xff1eb5a0
       cc/cg, 16437/1@1:	-> lib_I_dbg_gen:dbg_set_state(0x872000, 0xff1e9b00, 0x0, 0x1806bc)
       cc/cg, 16437/1@1:	<- lib_I_dbg_gen:dbg_set_state() = 0

Now, the truss_anno script isn’t such a big deal for a small test case like this, but when you’re using truss to debug a Makefile it’s a life-saver. One nice thing is that it’s stream-based, so you can use it as a pipe filter. You don’t have to collect the output to a file and then process it in a seperate step.

Perl is the tool that showed my I could conjur up complicated data structures (like trees of records) in almost no time. I’m sure I’ll rant about how wonderful that is at some future point. 😉

Stabs versus dwarf

Tuesday, June 7th, 2005

Stabs versus dwarf

The Sun compilers are currently undergoing a transition from stabs to dwarf. It sounds like the kind of undertaking where a +2 longsword might come in handy, but no. Leave your +2 sword in your three ring binder, and fire up your Sun Studio compilers to see what I’m talking about.

Stabs and dwarf are both different formats of debugging information. The Sun Studio 10 C compiler supports a command line option -xdebugformat=dwarf which tells the C compiler to spit out dwarf data instead of stabs. Eventually we’ll be transitioning all the compilers to use dwarf by default. Dwarf has a number of advantages.

  • It’s much easier for platform independant tools to read dwarf. Instead of parsing the more esoteric stabs. (See below if you dare)
  • Dwarf is actively being developed by a group of engineers working on compilers and tools. (See the working group’s home page)
  • Dwarf is more easily extendable, so that when you want to implement advanced features (say optimized code debugging) adding new data in a structured way is a piece of cake. New data can easily be ignored by tools that don’t understand it.

stabs

In the beginning, the symbol information you needed to debug a program was simple. The name of the function, where it started, where (in the assembly code) line 7 turned into line 8, etc. You needed the names of the local variables and parameters and what their stack offsets are. That was about it. Then came C++. Then came C++ with templates. Then came function template instances with template arguments.

The stabs format for debugging information is basically a large array of fixed-size records. It has one enum, a few small integer arguments, and one string. So when extensions were added, the string just kept getting uglier and uglier. Here’s an example from the stabs document:

template int tfex( B* x, A y ) { ... }

This turns into:

.stabs "__1cEtfex3CTACTB_6Fp10_i_:YTf
A:tYC(0,19);B:tYC(0,20); // define template params A and B
@;;__1cEtfex3CTACTB_6Fp10_i_
:T(0,3);(0,21)=*(0,20);(0,19); // returns int, gets (B*, A)
@;
x:p(0,21);y:p(0,19);// name function params x and y
@;1;1;",// line number of template source.
N_GSYM,0x0,0x0,0x0// rest of stab

Of course in the wild it just looks like:

(CSS aside: Boy, getting this long line to be wrapped by the browser to fit inside the width of the current page seems to be impossible! How disappointing. Maybe the roller blog system will pretty it up for me.)

.stabs "__1cEtfex3CTACTB_6Fp10_i_:YTfA:tYC(0,19);B:tYC(0,20)
;@;;__1cEtfex3CTACTB_6Fp10_i_:T(0,3);(0,21)=
*(0,20);(0,19);@;x:p(0,21);y:p(0,19);@;1;1;",N_GSYM,0x0,0x0,0x0

(Nope, I had to put line breaks in)

The things that look like (0,3) are references to types that might be defined in other stabs. If you want to know how dbx makes sense of all that goop, just take a look at the stabs document. All the really juicy bits are described under the Symbol Descriptors section.

If you want to see what the function name looks like in the source program, you can demangle it by running the output through c++filt.

% echo __1cEtfex3CTACTB_6Fp10_i_ | c++filt
int tfex<__type_0,__type_1>(__type_1*,__type_0)

You can see the stabs in a program by using dumpstabs a.out

dwarf

The dwarf format is more structured. You can see the output of dwarf using the dwarfdump command. Here is some example dwarfdump output:

% dwarfdump -a a.out
...
<1><  405>      DW_TAG_base_type
                DW_AT_name                  int
                DW_AT_encoding              DW_ATE_signed
                DW_AT_byte_size             4
...
...
<1><  713>      DW_TAG_class_type
                DW_AT_name                  c4
                DW_AT_decl_file             1
                DW_AT_decl_line             2
                DW_AT_byte_size             16
<2><  729>      DW_TAG_member
                DW_AT_name                  ii
                DW_AT_type                  <405>
                DW_AT_data_member_location  DW_OP_plus_uconst 0
                DW_AT_accessibility         DW_ACCESS_public
<2><  741>      DW_TAG_member
                DW_AT_name                  dd
                DW_AT_type                  <562>

It’s much easier to read the output of dwarfdump than the output of dumpstabs. The numbers like <405> point to other TAGs in the same dwarf file. So you can just search for that number in the file, to see what the type of that member is.

Implementing Dwarf support in the Sun compilers has occupied a good chunk of my time for the last several years (on and off), so you’re bound to hear more about it from me. Stay tuned.