Archive for the ‘dbx’ Category

debug info in XML, and DSD 2.0

Sunday, September 18th, 2005

I’ve been working in my spare time on the idea of converting dwarf debugging information into XML, so that I can format it as XHTML using a stylesheet and check it using a schema of some sort. When I started fiddling today I assumed that using a DTD was the way to go; that’s what one does with XML, no? Well, after banging my head against the DTD format for a while, and looking for help on (our friend) the internet, I stumbled across a more general description of the different ways to write XML schemas. Notice the lower case schema there. One of the several ways is called simply “XML Schema”, which is an alternative to DTD, DSD, etc. Don’t get confused yet, you just started reading.

I had to kick myself in the head again tonight. Every time I get really stumped trying to find good information on the internet, I end up realizing that everything I wanted was already there in Wikipedia. In the really confusing situations where I’m jumping in the deep end of the pool, all I really need to start me out is a two page summary of the state of the art. Something to put all the technology jargon into context for me. But I still haven’t learned to look on Wikipedia first. *kick* *ouch*

God, lists look ugly in the hacked theme I’m using. I should fix them up one day. Anyway, here is good info on the different ways to formally describe your XML so that you can check it, and make sure it doesn’t have bugs.

From oldest and klunkiest to newest and hottest, the different languages are: 1) DTD 2) “XML Schema” 3) DSD 2.0

Note: Clayton Wheeler also pointed me at Relax NG, which is much more elegant.
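
The conversion idea at the top of this post can be sketched roughly in C++. This is only an illustration: the `Die` struct, the `xmlEscape` and `toXml` functions, and the choice of one XML element per tag are all made up here, and the real conversion may look quite different.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory form of one debug-info entry: a tag,
// some name/value attributes, and nested child entries.
struct Die {
    std::string tag;
    std::vector<std::pair<std::string, std::string>> attrs;
    std::vector<Die> children;
};

// Escape the characters that are special inside an XML attribute value.
std::string xmlEscape(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Write one entry as an element named after its tag, with one XML
// attribute per debug attribute, recursing into the children.
std::string toXml(const Die& die, std::string::size_type depth = 0) {
    std::string pad(depth * 2, ' ');
    std::string out = pad + "<" + die.tag;
    for (const auto& a : die.attrs)
        out += " " + a.first + "=\"" + xmlEscape(a.second) + "\"";
    if (die.children.empty())
        return out + "/>\n";
    out += ">\n";
    for (const Die& child : die.children)
        out += toXml(child, depth + 1);
    return out + pad + "</" + die.tag + ">\n";
}
```

Once the data is in a shape like this, any of the schema languages listed above could be used to check that every element carries the attributes it should.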

simple umem integration with dbx

Monday, July 25th, 2005

The mdb debugger has some really nifty integration with libumem, as documented in this technical article, and Adam Leventhal’s famous Top 20 Blog.

I got a request recently asking if dbx had similar features. I think the engineer who asked was already familiar with the memory checking features dbx has (Run Time Checking), but the simplicity and speed of libumem checking have their own advantages.

So anyway, after a bunch of fiddling around in my C.S.T. (copious spare time) I got a ksh script together that I’m not completely embarrassed to share with the world. It has a few basic functions like:

  • Dump the transaction log
  • Show all the transaction log entries for a specific block, along with the stack trace for each log entry
  • Given any address, search the transaction log for any blocks containing that address
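
The last item is really just a range check over the log. Here is a minimal C++ sketch of that search, not the ksh script itself. The `LogEntry` struct and `entriesContaining` are hypothetical names for illustration; the real libumem log records carry more (such as the stack trace) and their layout is private to libumem.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified transaction log entry, made up for
// this example. It is not the real libumem record layout.
struct LogEntry {
    std::uintptr_t block;  // start address of the block
    std::size_t size;      // size of the block in bytes
    bool is_free;          // true for a free, false for an allocation
};

// Return the index of every entry whose block covers addr, i.e.
// block <= addr < block + size.
std::vector<std::size_t> entriesContaining(const std::vector<LogEntry>& log,
                                           std::uintptr_t addr) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < log.size(); ++i) {
        const LogEntry& e = log[i];
        // Subtracting first avoids overflow in block + size.
        if (addr >= e.block && addr - e.block < e.size)
            hits.push_back(i);
    }
    return hits;
}
```

An address in the middle of a block that was allocated and later freed would match both log entries, which is exactly what you want when chasing a use-after-free.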

The dbx libumem module has a simple demo script in the documentation and a few tech notes. Please try it out if you’re interested, and let me know how it goes. If you have trouble downloading it because it doesn’t have a .txt extension, let me know.

Update:

There is a Solaris bugid asking for libumem to export this information in a stable form. It is bugid:
6297789 libumem could use a libumem_db


Stabs versus dwarf

Tuesday, June 7th, 2005


The Sun compilers are currently undergoing a transition from stabs to dwarf. It sounds like the kind of undertaking where a +2 longsword might come in handy, but no. Leave your +2 sword in your three ring binder, and fire up your Sun Studio compilers to see what I’m talking about.

Stabs and dwarf are two different formats for debugging information. The Sun Studio 10 C compiler supports a command line option -xdebugformat=dwarf which tells the C compiler to spit out dwarf data instead of stabs. Eventually we’ll be transitioning all the compilers to use dwarf by default. Dwarf has a number of advantages.

  • It’s much easier for platform-independent tools to read dwarf than to parse the more esoteric stabs. (See below if you dare)
  • Dwarf is actively being developed by a group of engineers working on compilers and tools. (See the working group’s home page)
  • Dwarf is more easily extendable, so that when you want to implement advanced features (say optimized code debugging) adding new data in a structured way is a piece of cake. New data can easily be ignored by tools that don’t understand it.

stabs

In the beginning, the symbol information you needed to debug a program was simple. The name of the function, where it started, where (in the assembly code) line 7 turned into line 8, etc. You needed the names of the local variables and parameters and what their stack offsets are. That was about it. Then came C++. Then came C++ with templates. Then came function template instances with template arguments.

The stabs format for debugging information is basically a large array of fixed-size records. Each record has one enum, a few small integer arguments, and one string. So when extensions were added, the string just kept getting uglier and uglier. Here’s an example from the stabs document:

template <class A, class B> int tfex( B* x, A y ) { ... }

This turns into:

.stabs "__1cEtfex3CTACTB_6Fp10_i_:YTf
A:tYC(0,19);B:tYC(0,20); // define template params A and B
@;;__1cEtfex3CTACTB_6Fp10_i_
:T(0,3);(0,21)=*(0,20);(0,19); // returns int, gets (B*, A)
@;
x:p(0,21);y:p(0,19);// name function params x and y
@;1;1;",// line number of template source.
N_GSYM,0x0,0x0,0x0// rest of stab

Of course in the wild it just looks like:

(CSS aside: Boy, getting this long line to be wrapped by the browser to fit inside the width of the current page seems to be impossible! How disappointing. Maybe the roller blog system will pretty it up for me.)

.stabs "__1cEtfex3CTACTB_6Fp10_i_:YTfA:tYC(0,19);B:tYC(0,20)
;@;;__1cEtfex3CTACTB_6Fp10_i_:T(0,3);(0,21)=
*(0,20);(0,19);@;x:p(0,21);y:p(0,19);@;1;1;",N_GSYM,0x0,0x0,0x0

(Nope, I had to put line breaks in)

The things that look like (0,3) are references to types that might be defined in other stabs. If you want to know how dbx makes sense of all that goop, just take a look at the stabs document. All the really juicy bits are described under the Symbol Descriptors section.
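
To get a feel for how many of those references are packed into one string, here is a small C++ sketch that just pulls the number pairs out of a stab. It is plain pattern matching, not how dbx reads stabs, and the function name `typeRefs` is made up for this example.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Collect every "(n,m)" type reference in a stab string, in order
// of appearance. This only pattern-matches the text; it does not
// interpret the symbol descriptors around each reference.
std::vector<std::pair<int, int>> typeRefs(const std::string& stab) {
    static const std::regex ref(R"(\((\d+),(\d+)\))");
    std::vector<std::pair<int, int>> refs;
    for (std::sregex_iterator it(stab.begin(), stab.end(), ref), end;
         it != end; ++it) {
        refs.emplace_back(std::stoi((*it)[1].str()), std::stoi((*it)[2].str()));
    }
    return refs;
}
```

Run over the parameter part of the stab above, `x:p(0,21);y:p(0,19);`, this yields the pairs (0,21) and (0,19), which are the types of `x` and `y`.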

If you want to see what the function name looks like in the source program, you can demangle it by running the output through c++filt.

% echo __1cEtfex3CTACTB_6Fp10_i_ | c++filt
int tfex<__type_0,__type_1>(__type_1*,__type_0)

You can see the stabs in a program by using dumpstabs a.out

dwarf

The dwarf format is more structured. You can see the dwarf data in a program using the dwarfdump command. Here is some example dwarfdump output:

% dwarfdump -a a.out
...
<1><  405>      DW_TAG_base_type
                DW_AT_name                  int
                DW_AT_encoding              DW_ATE_signed
                DW_AT_byte_size             4
...
...
<1><  713>      DW_TAG_class_type
                DW_AT_name                  c4
                DW_AT_decl_file             1
                DW_AT_decl_line             2
                DW_AT_byte_size             16
<2><  729>      DW_TAG_member
                DW_AT_name                  ii
                DW_AT_type                  <405>
                DW_AT_data_member_location  DW_OP_plus_uconst 0
                DW_AT_accessibility         DW_ACCESS_public
<2><  741>      DW_TAG_member
                DW_AT_name                  dd
                DW_AT_type                  <562>

It’s much easier to read the output of dwarfdump than the output of dumpstabs. The numbers like <405> point to other TAGs in the same dwarf file. So you can just search for that number in the file, to see what the type of that member is.
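
Searching for the number by hand can also be automated. The following C++ sketch builds a lookup table from entry offset to name, assuming the text is laid out like the dwarfdump listing above. The function name `namesByOffset` is made up, and the parsing is deliberately naive: it handles only the line shapes shown here.

```cpp
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Map each entry's offset (the second <...> number on a DW_TAG line)
// to its DW_AT_name, assuming text laid out like the listing above.
std::map<int, std::string> namesByOffset(const std::string& dump) {
    static const std::regex tagLine(R"(^<\d+><\s*(\d+)>\s+DW_TAG_\w+)");
    static const std::regex nameLine(R"(^\s+DW_AT_name\s+(\S+))");
    std::map<int, std::string> names;
    std::istringstream in(dump);
    std::string line;
    std::smatch m;
    int current = -1;  // offset of the entry we are inside, if any
    while (std::getline(in, line)) {
        if (std::regex_search(line, m, tagLine))
            current = std::stoi(m[1].str());
        else if (current >= 0 && std::regex_search(line, m, nameLine))
            names[current] = m[1].str();
    }
    return names;
}
```

With that table, following `DW_AT_type <405>` on member `ii` is a lookup of offset 405, which names the `int` base type.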

Implementing Dwarf support in the Sun compilers has occupied a good chunk of my time for the last several years (on and off), so you’re bound to hear more about it from me. Stay tuned.

Pretty printing C++ types with dbx

Tuesday, May 17th, 2005

Since Lawrence doesn’t work at Sun any more, I’ll swipe a blog entry of his to make sure it stays available.

A tip from Lawrence Crowl:

One of the problems with debugging C++ programs is that they have many user-defined types. The debugger typically does not know anything about those types, and so cannot provide any high-level printing of those types. Instead, the debugger prints the types’ representations.

For example, consider a simple C++ standard string.

    #include <string>
    #include <iostream>

    int main() {
        std::string variable( "Hello!" );
        std::cout << variable << std::endl;
    }

In dbx, the result of printing variable is:

    (dbx) print variable
    variable = {
        __data_   = {
            __data_ = 0x41260 "Hello!"
        }
        npos      = 0
        __nullref = 0
    }

Not nice.

The Sun dbx debugger provides a helpful facility for up-leveling the printout. This facility is called pretty printing.

We can help dbx by defining a pretty printing call-back function. In essence, we write a function that converts from our high-level type into a char*. Dbx will look for pretty printing functions with the name db_pretty_print and a first parameter that is a pointer to the high-level type. In our example, the function is:

    char* db_pretty_print( std::string* var_addr, int, char* ) {
        // c_str() is guaranteed to be null-terminated
        return const_cast< char* >( var_addr->c_str() );
    }

(The second and third parameters are not needed in this example.)

Now, with the -p flag on the dbx print command line, dbx calls the pretty-printing function and uses the result for its output.

    (dbx) print -p variable
    variable = Hello!

You can make pretty printing the default by executing

    dbxenv output_pretty_print on

either in the debugger or in your ~/.dbxrc. Once pretty printing is the default, you can print the type’s representation with the +p flag.

    (dbx) print +p variable
    variable = {
        __data_   = {
            __data_ = 0x41260 "Hello!"
        }
        npos      = 0
        __nullref = 0
    }

For more information, type “help prettyprint” within dbx.

See more source in dbx

Thursday, May 12th, 2005

Okay, so you use dbx from the command line. When dbx stops at a breakpoint, it tells you the source line where you stopped. Well that’s nice. But it’s usually not enough context to know where you really are. You’d like to see more of the source. You can use the ‘list’ command to show you the source from that line down, but how can you see the source above and below that line at the same time? Easy. Write a little script. I got this script from someone else in my group, so I can’t take credit for it, but now I use it all the time. Put this script in your .dbxrc file, and away you go:

li() {
   list $[$vlineno-5], $[$vlineno-1]
   kprint -n ">"
   list $vlineno
   list $[$vlineno+1], $[$vlineno+6]
}

Here’s what it looks like in use:

stopped in main at line 64 in file "Cdlib.c"
   64     if (argc == 1)
(dbx) li
   59     FILE *cd_file;
   60     cd_title *cd_p, *prev_cd_p, *cd_p_2;
   61     char cd_info_path[PATH_MAX], *lp;
   62     int tr;
   63
>   64     if (argc == 1)
   65       sprintf (cd_info_path, "%s/.workmandb", ...
   66     else
   67       strcpy (cd_info_path, argv[1]);
   68
   69     cd_file = fopen (cd_info_path, "r");
   70     if (cd_file == NULL) {

Getting the current line to line up with the others (because of the arrow), even when the first character might be a tab, is left as an exercise for the reader. 🙂 More later. I’ll be on vacation for a week.