Chris Quenelle is a tools developer at Oracle Corp. He's worked on performance and debugging tools at Sun and Oracle for over 15 years. He reads comic books and science fiction,and has more tivos than he can keep track of.

 

February 2012
SMTWTFS
«Jan  
 1234
567891011
12131415161718
19202122232425
26272829 

Using dbx and libumem to find memory problemsUsing dbx and libumem to find memory problems

Update:There is a new version of umem.dbx here (Solaris 9 support) Download it here.

I implemented a spiffy little dbx module to give basic access to libumem debugging features a while back,but I haven’t gotten much feedback on it.  Think of this blog entry like a Dunk Tank at the carnival. Throw a blog comment at me (like a bug or an RFE) and make me add some new features to my libumem dbx module.

Update March 9,2010:It looks like Eric dunked me good.  (See the comments).  This module may or may not work for you. But it’s still probably useful as an example of how to write advanced dbx scripts.

Here are things I know to be true:

  • developers prefer dbx to mdb (unless you’re hacking on the kernel)
  • the RTC feature in Sun Studio is slow and sometimes buggy
  • memory allocation and access bugs are NASTY to track down
  • nobody has requested any new features for the libumem module

So the only conclusion I can draw is that not enough people know about this module yet.  :-)   With that in mind,here is the command line version of the ever-popular screenshot.

(dbx) source umem.dbx(dbx) alias u=umem(dbx) u startEnabling libumem debugging(dbx) cc -g t.c(dbx) debug a.out(dbx) list 1,$  1  #include <malloc.h> 2  int main()  3 { 4  char * p; 5  p = malloc(1); 6  p = malloc(1); 7  p = malloc(1); 8  free(p); 9  p++; 10  // this free will cause an error in libumem  11  // if checking is on,because it's a bad free  12  free(p); 13  }  14(dbx) runsignal ABRT (Abort) in __lwp_kill at 0xff2bd5ec0xff2bd5ec:__lwp_kill+0x0008:bcc,a,pt %icc,__lwp_kill+0x18 ! 0xff2bd5fcCurrent function is main  12  free(p);(dbx) print pp = 0x5bfa9 "\xad\xbe\xef\xde\xad\xbe\xef\xfe\xed\xfa\xce\xfe\xed\xfa\xce"(dbx) u findblock 0x5bfa9Building umem_syms helper library.Address 0x5bfa9 is inside the umem block at 0x5bfa0.  This corresponds to the malloc block at 0x5bfa8.# So we can see that the pointer we tried to free points into# the middle of a block.# Let's ask for a history of the umem block that caused umem to barf.(dbx) u bhist 0x5bfe0=================================================================  Log Rec Addr   Block Addr  Thrd   Timestamp  ------------   ----------  ----   ---------  0x320c8   0x5bfa0  1 0xbda1d488b82600x107d0:in `a.out`_start  0x10c1c:in `a.out`t.c`main at "t.c":70xff36aeb4:in `libumem.so.1`malloc  0xff36e2d0:in `libumem.so.1`_umem_alloc  0xff36de8c:in `libumem.so.1`_umem_cache_alloc  =================================================================  Log Rec Addr   Block Addr  Thrd   Timestamp  ------------   ----------  ----   ---------  0x3212c   0x5bfa0  1 0xbda1d488ba0100x107d0:in `a.out`_start  0x107d0:in `a.out`_start  0x10c2c:in `a.out`t.c`main at "t.c":80xff36b214:in `libumem.so.1`malloc.c`process_free  0xff36dfec:in `libumem.so.1`_umem_cache_free  =================================================================# Don't ask me why _start shows up twice in libumem stack capture.# It's probably a stray tail-call optimization someplace.# If you want to see the recent history of allocations/frees,do this:(dbx) u log  Log Rec Addr   Block Addr  Thrd   Timestamp  ------------   ----------  ----   ---------  0x32000   0x5bfe0  1 0xbda1d488b15c8  0x32064   0x5bfc0  1 0xbda1d488b6fa0  0x320c8   0x5bfa0  1 0xbda1d488b8260  0x3212c   0x5bfa0  1 0xbda1d488ba010
http://quenelle.org/unix/wp-content/plugins/sociofluid/images/digg_24.pnghttp://quenelle.org/unix/wp-content/plugins/sociofluid/images/reddit_24.pnghttp://quenelle.org/unix/wp-content/plugins/sociofluid/images/dzone_24.pnghttp://quenelle.org/unix/wp-content/plugins/sociofluid/images/stumbleupon_24.pnghttp://quenelle.org/unix/wp-content/plugins/sociofluid/images/delicious_24.pnghttp://quenelle.org/unix/wp-content/plugins/sociofluid/images/furl_24.pnghttp://quenelle.org/unix/wp-content/plugins/sociofluid/images/technorati_24.pnghttp://quenelle.org/unix/wp-content/plugins/sociofluid/images/facebook_24.pnghttp://quenelle.org/unix/wp-content/plugins/sociofluid/images/twitter_24.png

3 comments to Using dbx and libumem to find memory problems

  • Eric Safern

    Hello,

    I’ve been playing with your libumem module for dbx –but no joy,so far.

    Here’s what I’ve got –a simple program that corrupts memory.

    When I use mdb,I get this:

    05:14 PM sundev19:c++ $ mdb a.out
    >::run
    leaking 20 bytes…
    leaking 20 bytes…
    leaking 20 bytes…
    leaking 20 bytes…
    leaking 20 bytes…
    mdb:stop on SIGABRT
    mdb:target stopped at:
    libc.so.1`_lwp_kill+8:bcc,a,pt %icc,+0×10
    mdb:You’ve got symbols!
    Loading modules:[ ld.so.1 libumem.so.1 libc.so.1 ]
    >::umem_log
    CPU ADDR BUFADDR TIMESTAMP THREAD
    1 0006a190 00155ef8 20ccc937bc8c0 00000001
    1 0006a12c 00155f28 20ccc574506a0 00000001
    1 0006a0c8 00155f58 20ccc1b139930 00000001
    1 0006a064 00155f88 20ccbdf763e00 00000001
    1 0006a000 00155fb8 20ccba3c47b10 00000001
    0002a1f4 00000000 0 00000000

    But when I run the same code with dbx and your module,the transaction log is supposedly empty!

    Poking around inside dbx,I pulled up this data structure,which looks fishy to me –but I’m no expert.

    What do you think?

    (struct umem_log_header *) umem_transaction_log = 0×28008
    (dbx 24) p * ((umem_log_header *) umem_transaction_log)
    *((struct umem_log_header *) umem_transaction_log) ={
    lh_lock ={
    flags ={
    flag1 = 4U
    flag2 = ”
    ceiling = ”
    mbcp_type_un ={
    bcptype = 0
    mtype_rcount ={
    count_type1 = ”
    count_type2 = ”
    }
    }
    magic = 19800U
    }
    lock ={
    lock64 ={
    pad = “”
    }
    lock32 ={
    ownerpid = 0
    lockword = 0
    }
    owner64 = 0
    }
    data = 0
    }
    lh_base = 0x2a000 “”
    lh_free = 0×2000
    lh_chunksize = 141733920769U
    lh_nchunks = 1
    lh_head = 0
    lh_tail = 262144
    lh_hits = 19800
    lh_cpu = (
    {
    clh_lock ={
    flags ={
    flag1 = 0
    flag2 = ”
    ceiling = ”
    mbcp_type_un ={
    bcptype = 0
    mtype_rcount ={
    count_type1 = ”
    count_type2 = ”
    }
    }
    magic = 0
    }
    lock ={
    lock64 ={
    pad = “”
    }
    lock32 ={
    ownerpid = 0
    lockword = 0
    }
    owner64 = 0
    }
    data = 0
    }
    clh_current = (nil)
    clh_avail = 0
    clh_chunk = 0
    clh_hits = 0
    clh_pad = “”
    }

    )
    }

    Thanks in advance for your help!

  • Chris Quenelle

    I’ll take a look,thanks for letting me know.

  • Chris Quenelle

    Well this problem has been on my to-do list for months now,and I haven’t found the time to investigate. I’m really sorry about that Eric. I may still revive this project some day,but it probably won’t be soon.

Leave a Reply

  

  

  

You can use these HTML tags

<a href=""title=""><abbr title=""><acronym title=""><b><blockquote cite=""><cite><code><del datetime=""><em><i><q cite=""><strike><strong>