One of the things that’s really hard about debugging threaded programs is tracking down which threads own which locks, and figuring out which locks they are supposed to own. In other words, synchronization bugs. The most difficult symptom to debug is data corruption, because it’s very hard to track down exactly where things start to go wrong. In those cases where your program actually ends up in a deadlock, you get start with a smoking gun, and work from there. Much easier.
One way to find synchronization bugs in your program is to use Sun’s new Data Race Detection Tool you can find a preview version of that tool in Sun Studio Express 2.
Another way to hunt for bugs is to use dbx’s built-in synchronization debugging commands. You can list all the locks in your program, and find out which threads own them, and which threads are waiting on them.
Here is some output from my dining philosophers program:
All locks currently known to libthread:
forks (0x00021670): thread mutex(locked)
forks+0x18 (0x00021688): thread mutex(locked)
forks+0x30 (0x000216a0): thread mutex(locked)
forks+0x48 (0x000216b8): thread mutex(locked)
forks+0x60 (0x000216d0): thread mutex(locked)
foodlock (0x00021708): thread mutex(unlocked)
(dbx) sync -info 0x00021670
forks (0x21670): thread mutex(locked)
Lock owned by t@2
Threads blocked by this lock are:
t@6 a l@6 philosopher() sleep on 0x21670 in __lwp_park()
Okay, that’s fine. But obviously I need to pull out a pad of paper and start drawing boxes if I want to see where my bugs are. Of course, there are tools for drawing boxes and arrows, and all you have to do to use them is to convert your data into XML.
So I wrote a little ksh script and a little perl script, and presto, instant pictures. Well, I had to download a graph editing/layout tool … and I had to learn how to use it. But that wasn’t so bad.
When I first ran my dining philosophers program, now don’t laugh, I didn’t actually unlock my eating utensils. I wrote the unlocks, but then I rearranged some stuff, and they got dropped on the floor. So the first time I ran it, I got a deadlock.
My original plan was:
- write functioning dining philosophers program
- inject artificial bug
- write graph utility
- write blog
I ended up executing a slightly different plan:
- write buggy dining philosophers program
- get deadlock
- write graph utility
- fix dining philosophers program
- write blog
Anyway, here is the picture that resulted from my deadlocked program. The graph edge with the same source/destination node is a dead give away. 🙁 The t@2 names are the dbx names for the threads. “forks” is the name if the array variable that holds the locks representing the eating utensils for the dining philosophers. So “forks” is one lock, and “forks+0xNN” is another lock. (see source code link below).
sync graph with bug
Here is a picture when the program is working right. In other words, after I added my missing unlock statements.
sync graph for working program
The ksh function is copied into the comments at the top of the perl script. So to run this demo yourself, here are the installation instructions.
- download dine.c
- download syncs_to_graphml
- install syncs_to_graphml somewhere on your search path (or edit the ksh script to find it)
- copy the ksh script out of the comments in the perl script and into your ~/.dbxrc file
- download the yEd program, and get it up and running (written in java, so it’s easy to set up)
To run the demo:
- compile dine.c with -g and -lpthread
- load it into dbx
- run it, and stop the program in the middle (ctrl-C)
- use the new syncgraph command inside dbx
- load the output file /tmp/syncgraph.graphml into the yEd editor
- use Tools -> Fit Node To Label (hit OK)
- use Layout -> Classic (hit OK)
At that point, you should get a picture of the threads and their locks.
At that point you can play around with the various layout options for arranging the nodes in the graph. Don’t be annoyed at all the little properties and numbers you can set. I just ignore those most of the time. You can also export the image as jpg, pdf, etc.