DWARF and XML
Tags: Compilers - dbx
For the last year or so I’ve had a hard time sifting through huge DWARF dump files, especially some of the enormous dumps from the C++ standard template library. (Blech.) So I’ve been working on a side project that lets me run more powerful queries on DWARF information. The part I usually have to sort through is the .debug_info section, which is essentially an abstract syntax tree of all (or part) of the information in the object file. To make it easier to sift through, I’ve started writing an XML dumper for this information, so that I get output something like:
<t:namespace id='1178'>
  <name string ='1'>std </name>
  <SUN_link_name string ='1'>__1nDstd_</SUN_link_name>
  <sibling ref4 ='1643'/> <!--__rwstd-->
  <t:structure_type id='1197'>
    <name string ='1'>char_traits<char></name>
    <SUN_part_link_name string ='1'>nLchar_traits4Cc_</SUN_part_link_name>
    <decl_file data1 ='3'/>
    <decl_line data1 ='182'/>
    <SUN_template ref4 ='1247'/> <!--char_traits-->
    <declaration flag ='1'/>
    <t:template_type_parameter id='1241'>
      <type ref4 ='883'/> <!--char-->
    </t:template_type_parameter>
  </t:structure_type>
Instead of the usual dwarfdump form, which is:
<1>< 1178> DW_TAG_namespace
    DW_AT_name std
    DW_AT_SUN_link_name __1nDstd_
    DW_AT_sibling <1643>
<2>< 1197> DW_TAG_structure_type
    DW_AT_name char_traits<char>
    DW_AT_SUN_part_link_name nLchar_traits4Cc_
    DW_AT_decl_file 3 /set/c++/cafe8/mkapoor/lang5.9/libCstd.2.1.1/include/rw/traits
    DW_AT_decl_line 182
    DW_AT_SUN_template <1247>
    DW_AT_declaration yes(1)
<3>< 1241> DW_TAG_template_type_parameter
    DW_AT_type <883>
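To make the mapping between the two forms concrete, here is a rough Python sketch that turns dwarfdump-style text into nested XML. This is only an illustration, not how the dumper described here works: the `dump_to_xml` function, the `dwarf` root element, and the simplified output (no `t:` prefix, no form attributes such as `string` or `ref4`) are all assumptions made for the example. It also assumes each entry header and each attribute sits on its own line.

```python
import re
import xml.etree.ElementTree as ET

# Shortened, hypothetical dump text in the dwarfdump layout shown above.
DUMP = """\
<1>< 1178> DW_TAG_namespace
    DW_AT_name std
    DW_AT_sibling <1643>
<2>< 1197> DW_TAG_structure_type
    DW_AT_name char_traits<char>
    DW_AT_decl_line 182
<3>< 1241> DW_TAG_template_type_parameter
    DW_AT_type <883>
"""

ENTRY = re.compile(r"<(\d+)><\s*(\d+)>\s+DW_TAG_(\w+)")  # <level>< offset> tag
ATTR = re.compile(r"DW_AT_(\w+)\s+(.*)")                  # attribute and value


def dump_to_xml(text):
    """Build an element tree whose nesting follows the <level> numbers."""
    root = ET.Element("dwarf")   # hypothetical wrapper element
    stack = [(-1, root)]         # (nesting level, element); root sits below level 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = ENTRY.match(line)
        if m:
            level, offset, tag = int(m.group(1)), m.group(2), m.group(3)
            # Pop back up until the top of the stack is this entry's parent.
            while stack[-1][0] >= level:
                stack.pop()
            elem = ET.SubElement(stack[-1][1], tag, id=offset)
            stack.append((level, elem))
            continue
        m = ATTR.match(line)
        if m and len(stack) > 1:
            # Each attribute becomes a child element of the current entry.
            ET.SubElement(stack[-1][1], m.group(1)).text = m.group(2)
    return root


root = dump_to_xml(DUMP)
print(ET.tostring(root, encoding="unicode"))
```

When serialized, ElementTree escapes the angle brackets in names like `char_traits<char>`, which keeps the output well-formed XML.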
The XML format is still preliminary, but it lets me experiment with using the XQuery language to search the XML and extract pieces of it. (I could also use XSLT, but XQuery is a little better for joins and more complex searches.) XQuery includes the XPath syntax as a subset. I’m sure all this is just a bunch of gobbledygook unless you already know some of this stuff, so here is an example:
In XPath, you can select all the XML nodes in a document based on what their parents are, for example:
//namespace/struct
This XPath expression would select all the “struct” XML nodes that are children of “namespace” nodes.
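For a quick feel of this without an XQuery engine, here is a minimal sketch using Python's `xml.etree.ElementTree`, which supports a limited subset of XPath. The sample document and its unprefixed element names are made up for illustration; they are not the dumper's real output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one struct inside a namespace, one at top level.
SAMPLE = """
<dwarf>
  <namespace id="1178">
    <name>std</name>
    <struct id="1197"><name>char_traits</name></struct>
  </namespace>
  <struct id="2000"><name>top_level</name></struct>
</dwarf>
"""

root = ET.fromstring(SAMPLE.strip())

# ElementTree paths are relative to the element you search from,
# so the document-wide "//namespace/struct" is written ".//namespace/struct".
matches = root.findall(".//namespace/struct")
ids = [m.get("id") for m in matches]
print(ids)
```

Only the struct nested inside the namespace is selected; the top-level struct is skipped because its parent is not a `namespace` node.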
Using XQuery, I wrote a simple script to dig out all the elements with a specific name and show the names of the containers that are their ancestors. The pathname to Mukesh’s source tree makes a featured appearance here because that’s where I got my sample debug information from; it all started while I was trying to track down a bug in the debug info for libCstd.
% ruby dwcmd.rb dwarf xgrep findname dw.xml __unLink
<?xml version="1.0" encoding="UTF-8"?>
/set/c++/cafe8/mkapoor/lang5.9/libCstd.2.1.1/include/string.cc - 11
std - 1120
basic_string<char,std::char_traits<char>,std::allocator<char> > - 1827
__unLink - 2455
/set/c++/cafe8/mkapoor/lang5.9/libCstd.2.1.1/include/string.cc - 11
std - 1120
basic_string<char,std::char_traits<char>,std::allocator<char> > - 1771
__unLink - 2201
/set/c++/cafe8/mkapoor/lang5.9/libCstd.2.1.1/include/ostream.cc - 11
std - 1121
basic_string<char,std::char_traits<char>,std::allocator<char> > - 2735
__unLink - 2926
/set/c++/cafe8/mkapoor/lang5.9/libCstd.2.1.1/include/ostream.cc - 11
std - 1121
basic_string<char,std::char_traits<char>,std::allocator<char> > - 2806
__unLink - 2997
As you can see, an item named “__unLink” shows up four times. I extended the script so that you can filter which items you see based on the names of their containers. When I search for “ostream:__unLink”, the script shows only the items named __unLink that sit inside items with “ostream” in the name.
% ruby dwcmd.rb dwarf xgrep findname dw.xml ostream:__unLink
<?xml version="1.0" encoding="UTF-8"?>
/set/c++/cafe8/mkapoor/lang5.9/libCstd.2.1.1/include/ostream.cc - 11
std - 1121
basic_string<char,std::char_traits<char>,std::allocator<char> > - 2735
__unLink - 2926
/set/c++/cafe8/mkapoor/lang5.9/libCstd.2.1.1/include/ostream.cc - 11
std - 1121
basic_string<char,std::char_traits<char>,std::allocator<char> > - 2806
__unLink - 2997
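The real script is written with XQuery and is not shown here, but the idea behind the search and the container filter can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the `find_name` function, the simplified sample document, and the element names `compile_unit` and `subprogram`. It assumes each entry stores its name in a `name` child element, and that a filter is written as `container:item`, where the container part is matched as a substring of any ancestor's name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for the dumped XML.
SAMPLE = """
<dwarf>
  <compile_unit id="11"><name>string.cc</name>
    <namespace id="1120"><name>std</name>
      <structure_type id="1827"><name>basic_string</name>
        <subprogram id="2455"><name>__unLink</name></subprogram>
      </structure_type>
    </namespace>
  </compile_unit>
  <compile_unit id="11"><name>ostream.cc</name>
    <namespace id="1121"><name>std</name>
      <structure_type id="2735"><name>basic_string</name>
        <subprogram id="2926"><name>__unLink</name></subprogram>
      </structure_type>
    </namespace>
  </compile_unit>
</dwarf>
"""


def entry_name(elem):
    """Return the text of an entry's direct <name> child, or None."""
    child = elem.find("name")
    return child.text.strip() if child is not None and child.text else None


def find_name(root, spec):
    """Find entries named by `spec`, e.g. "__unLink" or "ostream:__unLink".

    Returns one chain per hit: (name, id) pairs from outermost to innermost.
    """
    *containers, target = spec.split(":")
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    hits = []
    for elem in root.iter():
        if entry_name(elem) != target:
            continue
        # Walk up to the root, collecting every named ancestor.
        chain = []
        node = elem
        while node is not None:
            name = entry_name(node)
            if name is not None:
                chain.append((name, node.get("id")))
            node = parent_of.get(node)
        chain.reverse()
        ancestors = [name for name, _ in chain[:-1]]
        # Every container pattern must occur in some ancestor's name.
        if all(any(c in a for a in ancestors) for c in containers):
            hits.append(chain)
    return hits


root = ET.fromstring(SAMPLE.strip())
all_hits = find_name(root, "__unLink")
ostream_hits = find_name(root, "ostream:__unLink")
for chain in ostream_hits:
    print(" / ".join(f"{name} - {ident}" for name, ident in chain))
```

One limitation of this sketch: because the filter is split on `:`, a container pattern that itself contains a colon (such as `std::allocator`) would be split incorrectly.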
Pretty cool, huh?
Anyway, that’s as far as I got. There are always more compiler bugs to fix, so I don’t get much time to work on infrastructure and internal tools. Maybe I’ll get some more hacking done over the holidays. XML feeds into some of my areas of technical curiosity, like RDF, RDFa, SPARQL, FOAF, etc.