Archive for the ‘Debugging’ Category

Stupid UNIX Tricks #1 : LANG and shell scripts

Saturday, April 14th, 2012

If you’ve been using UNIX systems for a while (including Mac OS X, Linux or anything else remotely similar) you might know about the LANG environment variable.  It’s used to select how your computer treats language-specific features.  You can find out more than you ever wanted to know by looking here:

Mostly it doesn’t make much difference in your life, except there are two commonly used default settings.  One common setting is LANG=C which enables some very old-fashioned standard-conforming details and allows an implementation to skip lots of fancy language processing code.  Another common setting is LANG=en_US.UTF-8.  That setting tells the various system functions in libc to expect strings encoded as UTF-8 and to follow US English conventions for things like sorting and character classification.

On the systems I use, it seems like the default is en_US.UTF-8.  But I suspect that most people must have LANG=C somewhere in their 20-year-old .login files, because I occasionally run into bugs where some script doesn’t work right unless you have LANG=C.

Here’s an example:

% mkdir test; cd test; touch Caa cbb
% export LANG=C
% echo [c-d]*
cbb
% export LANG=en_US.UTF-8
% echo [c-d]*
Caa cbb

So the range of characters from ‘c’ to ‘d’ includes the letter ‘C’ if you are in the en_US.UTF-8 locale.  Ugh.  It’s easy to get that wrong in your shell script someplace, and people do.

Here’s an easier way to show why that happens:

% mkdir test1; cd test1; touch a A b B c C;
% export LANG=C
% ls
A  B  C  a  b  c
% export LANG=en_US.UTF-8
% ls
a  A  b  B  c  C

So you can see the sort order of strings used by the ls command matches the character order that the shell uses to expand character ranges in glob patterns.  I suppose it’s consistent.  But it’s one of the things that makes it a challenge to write shell scripts that are robust and portable to different users’ environments.
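One defensive habit is to pin the locale inside the script instead of trusting whatever the caller has set. Here is a minimal sketch, assuming a POSIX shell; the function name list_lower_cd is made up for illustration. It sets LC_ALL rather than LANG, because LC_ALL overrides LANG and any LC_* variables the user may have set.

```shell
#!/bin/sh
# Sketch: make a glob range behave the same regardless of the caller's LANG.
# list_lower_cd is a hypothetical helper name, not part of any standard tool.

# list_lower_cd DIR
# Prints the names in DIR that start with a lowercase 'c' or 'd'.
# The function body is a subshell, so neither the locale change nor the
# cd leaks back into the caller.
list_lower_cd() (
    LC_ALL=C
    export LC_ALL
    cd "$1" || exit 1
    # In the C locale [c-d] is a plain byte range, so it cannot match 'C'.
    for f in [c-d]*; do
        # An unmatched glob is left as the literal pattern; skip it.
        if [ -e "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
)
```

Run against a directory that contains Caa and cbb, this prints only cbb, whatever LANG the caller happens to have.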

Visualizing dynamic library dependencies

Friday, May 22nd, 2009

Darryl Gove has been working on graphical display of shared library dependencies. It seems useful for performance analysis and debugging of dynamically linked applications.

He did one for StarOffice, and for Firefox and Thunderbird.

Debugger Design

Friday, March 20th, 2009

I’ve spent a number of years in the dbx group at Sun, and over time you collect a lot of coulda-woulda-shoulda stories.  You know what I mean, “This code should really have been designed to do XYZ.”  Or “This module shouldn’t have to talk to that module.”  I figured I’d try to record some of the interesting bits for posterity, so I wrote an essay that I vaingloriously call a whitepaper.  So without further ado:

Debugging tips for threaded programs on Solaris

Friday, June 1st, 2007

Phil Harmon wrote a blog entry over a year ago (Solaris Threads Tunables) where he mentioned a list of tunable parameters that you can use to fiddle around with the implementation of Solaris libthread. You can fine-tune the spin-lock timeouts and other timing details. But one of the flags that he mentioned is NOT related to tuning libthread. It’s more related to debugging your program! Someone on our internal dbx-interest alias asked why their program (which had a bug) was acting differently when run under dbx, and the answer turns out to be related to a “sync tracking” flag that dbx turns on by default. It turns on somewhat stricter checking for mutex bugs.

Anyway, it turns out that if you set the environment variable _THREAD_ERROR_DETECTION to 1 or 2 you can get an extra level of error checking enabled inside libthread. 1 produces warning messages, and 2 produces warning messages and a core file for inspection.
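If you only want the extra checking for a single run, you can set the variable for one command instead of exporting it. The wrapper below is a small sketch of that idea; the name with_thread_checks is invented here, and the only thing taken from libthread is the variable name and its two documented values.

```shell
#!/bin/sh
# Sketch: run one command with _THREAD_ERROR_DETECTION set, without
# touching the environment of the calling shell.
# with_thread_checks is a hypothetical helper name.

# with_thread_checks LEVEL COMMAND [ARGS...]
#   LEVEL 1: warning messages
#   LEVEL 2: warning messages and a core file
with_thread_checks() {
    level=$1
    shift
    case $level in
        1|2) ;;
        *)
            echo "with_thread_checks: level must be 1 or 2" >&2
            return 2
            ;;
    esac
    # env sets the variable for the child process only.
    env _THREAD_ERROR_DETECTION="$level" "$@"
}
```

For example, `with_thread_checks 1 ./myprog` would run a (hypothetical) ./myprog with warnings enabled.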

The messages look like this:

*** _THREAD_ERROR_DETECTION: lock usage error detected ***
mutex_lock(0x8047d50): calling thread already owns the lock
calling thread is 0xfeea2000 thread-id 1

Most of the implementation is in libc/port/threads/assfail.c

Graphical display of thread synchronization

Monday, September 11th, 2006

One of the things that’s really hard about debugging threaded programs is tracking down which threads own which locks, and figuring out which locks they are supposed to own. In other words, synchronization bugs.  The most difficult symptom to debug is data corruption, because it’s very hard to track down exactly where things start to go wrong. In those cases where your program actually ends up in a deadlock, you get to start with a smoking gun, and work from there. Much easier.

One way to find synchronization bugs in your program is to use Sun’s new Data Race Detection Tool. You can find a preview version of that tool in Sun Studio Express 2.

Another way to hunt for bugs is to use dbx’s built-in synchronization debugging commands.  You can list all the locks in your program, and find out which threads own them, and which threads are waiting on them.

Here is some output from my dining philosophers program:

(dbx) syncs
All locks currently known to libthread:
forks (0x00021670): thread  mutex(locked)
forks+0x18 (0x00021688): thread  mutex(locked)
forks+0x30 (0x000216a0): thread  mutex(locked)
forks+0x48 (0x000216b8): thread  mutex(locked)
forks+0x60 (0x000216d0): thread  mutex(locked)
foodlock (0x00021708): thread  mutex(unlocked)

(dbx) sync -info 0x00021670
forks (0x21670): thread  mutex(locked)
Lock owned by t@2
Threads blocked by this lock are:
        t@6 a l@6 philosopher() sleep on 0x21670 in __lwp_park()

Okay, that’s fine. But obviously I need to pull out a pad of paper and start drawing boxes if I want to see where my bugs are. Of course, there are tools for drawing boxes and arrows, and all you have to do to use them is to convert your data into XML.

So I wrote a little ksh script and a little perl script, and presto, instant pictures. Well, I had to download a graph editing/layout tool … and I had to learn how to use it. But that wasn’t so bad.
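The conversion step can be sketched in a few lines of shell and awk. This is not the syncs_to_graphml script itself, just an illustration that assumes the input is the text of one or more "sync -info" reports in the format shown above. The function name, the edge directions (lock to owner, waiter to lock) and the "label" key are all my own choices for the sketch. It writes plain GraphML; a layout tool may need extra attributes before it shows the labels.

```shell
#!/bin/sh
# Sketch: turn "sync -info" text on stdin into minimal GraphML on stdout.
# syncinfo_to_graphml is a hypothetical name; the parsing rules are guesses
# based only on the sample output shown in this post.
syncinfo_to_graphml() {
    awk '
    # Give each distinct name a node id (n1, n2, ...) and remember its label.
    function node(name) {
        if (!(name in id)) { id[name] = "n" (++n); label[n] = name }
        return id[name]
    }
    # Escape the two characters that are unsafe in XML element text.
    function esc(s) { gsub(/&/, "\\&amp;", s); gsub(/</, "\\&lt;", s); return s }

    # "forks (0x21670): thread  mutex(locked)" starts a new lock record.
    /^[^ \t].*\(0x[0-9a-fA-F]+\):/ { lock = $1; node(lock); next }

    # "Lock owned by t@2": edge from the lock to its owner.
    lock != "" && /^Lock owned by t@[0-9]+/ {
        src[++m] = node(lock); dst[m] = node($4); next
    }

    # "    t@6 a l@6 philosopher() sleep on ...": edge from waiter to lock.
    lock != "" && /^[ \t]+t@[0-9]+ / {
        src[++m] = node($1); dst[m] = node(lock); next
    }

    END {
        print "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        print "<graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\">"
        print "  <key id=\"label\" for=\"node\" attr.name=\"label\" attr.type=\"string\"/>"
        print "  <graph id=\"syncs\" edgedefault=\"directed\">"
        for (i = 1; i <= n; i++)
            printf "    <node id=\"n%d\"><data key=\"label\">%s</data></node>\n", i, esc(label[i])
        for (i = 1; i <= m; i++)
            printf "    <edge source=\"%s\" target=\"%s\"/>\n", src[i], dst[i]
        print "  </graph>"
        print "</graphml>"
    }'
}
```

Saving the dbx output to a file and running `syncinfo_to_graphml < saved-output > /tmp/syncgraph.graphml` would produce one node per lock and thread, plus an edge for every "owned by" and "blocked" relationship.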

When I first ran my dining philosophers program, now don’t laugh, I didn’t actually unlock my eating utensils. I wrote the unlocks, but then I rearranged some stuff, and they got dropped on the floor. So the first time I ran it, I got a deadlock.

My original plan was:

  • write functioning dining philosophers program
  • inject artificial bug
  • write graph utility
  • write blog

I ended up executing a slightly different plan:

  • write buggy dining philosophers program
  • get deadlock
  • write graph utility
  • fix dining philosophers program
  • write blog

Anyway, here is the picture that resulted from my deadlocked program. The graph edge with the same source/destination node is a dead giveaway. 🙁 Names like t@2 are the dbx names for the threads.  “forks” is the name of the array variable that holds the locks representing the eating utensils for the dining philosophers. So “forks” is one lock, and “forks+0xNN” is another lock. (See source code link below.)

sync graph with bug

Here is a picture when the program is working right. In other words, after I added my missing unlock statements.

sync graph for working program

The ksh function is copied into the comments at the top of the perl script. So to run this demo yourself, here are the installation instructions.

  • download dine.c
  • download syncs_to_graphml
  • install syncs_to_graphml somewhere on your search path (or edit the ksh script to find it)
  • copy the ksh script out of the comments in the perl script and into your ~/.dbxrc file
  • download the yEd program, and get it up and running (written in java, so it’s easy to set up)

To run the demo:

  • compile dine.c with -g and -lpthread
  • load it into dbx
  • run it, and stop the program in the middle (ctrl-C)
  • use the new syncgraph command inside dbx
  • load the output file /tmp/syncgraph.graphml into the yEd editor
  • use Tools -> Fit Node To Label (hit OK)
  • use Layout -> Classic (hit OK)

At that point, you should get a picture of the threads and their locks.

From there you can play around with the various layout options for arranging the nodes in the graph.  Don’t be annoyed at all the little properties and numbers you can set.  I just ignore those most of the time. You can also export the image as jpg, pdf, etc.

debug info in XML, and DSD 2.0

Sunday, September 18th, 2005

I’ve been working in my spare time on the idea of converting DWARF debugging information into XML so that I can format it as XHTML using a stylesheet, and so I can check it using a schema of some sort. When I started fiddling today I assumed that using a DTD was the way to go; that’s what one does with XML, no? Well, after banging my head against the DTD format for a while, and looking for help on (our friend) the internet, I stumbled across a more general description of the different ways to write XML schemas. Notice the lower case schema there. One of the several ways is called simply “XML Schema”, which is an alternative to DTD and DSD etc. Don’t get confused yet, you just started reading.

I had to kick myself in the head again tonight. Every time I get really stumped on trying to find good information on the internet, I end up realizing that everything I wanted was already there in Wikipedia. In the really confusing situations where I’m jumping in the deep end of the pool, all I really need to start me out is a two page summary of the state of the art. Something to put all the technology jargon into context for me. But I still haven’t learned to look on Wikipedia first. *kick* *ouch*

God, lists look ugly in the hacked theme I’m using. I should fix them up one day. Anyway, here is good info on the different ways to formally describe your XML so that you can check it, and make sure it doesn’t have bugs.

From oldest and klunkiest to newest and hottest, the different languages are: 1) DTD 2) “XML Schema” 3) DSD 2.0

Note: Clayton Wheeler also pointed me at Relax NG, which is much more elegant.