Archive for the ‘OpenSolaris’ Category

Goodbye Solaris 9 (for Sun Studio)

Wednesday, August 20th, 2008

We’re making the internal transition to building Sun Studio on Solaris 10 (instead of Solaris 9). This is a big deal because the product bits immediately become useless on any Solaris 9 system. There’s a new libm.so.2 library that became available on Solaris 10, and if you depend on it, you can’t run on Solaris 9. It’s a challenge making sure our vast ocean of loosely maintained lab machines is ready for the change. The good news is we get to use newer, faster hardware. 🙂

I’ll make this post short because I’m using ScribeFire for the first time in forever, and I don’t trust it. I can’t believe blogging is still this hard. 🙁

SXDE 5/07 — Solaris Express Developers Edition

Tuesday, June 12th, 2007

A new update to Solaris Express Developer Edition is out. It has the latest and greatest Solaris Nevada build, along with all the development tools you could ever want. Well, I dunno, I can want a lot, but there’s a bunch of tools in there. My favorite parts are Sun Studio 12 and Solaris Nevada, but on the same DVD you get Netbeans 5.5 fully loaded, including the Creator RAD web development tool and other goodies.

I’ll take this opportunity to review some of my own personal favorite features of Sun Studio 12 and Solaris Nevada.

Sun Studio 12

The part of Sun Studio 12 that I was most directly involved with recently is the compiler support for the DWARF debugging format. We finished support for C++ and Fortran using DWARF, and now it will be much easier to maintain compatibility between Sun tools and the gcc tool suite. DWARF also has “location lists”, which we can use in the future to support local variables in optimized code. We were hoping to get this feature in for Sun Studio 12, but it didn’t make it in time. Look for optimized locals support in a future Sun Studio Express release.

One of the most visible changes in Sun Studio 12 is the new IDE. The debugger GUI has been rewritten to use the latest Netbeans, and to be much more like Visual Studio in some ways. Boy you should have heard the discussions we had about that! But it turns out that Visual Studio seems to be geared towards beginning developers, and that’s something we need to get better at. There are lots of changes in the new version, and lots of things for us still to implement, but give it a try, and let us know what you think.

Other major new features include a new Thread Analyzer tool for finding data races in threaded programs, and the Linux version of our product now includes all the compilers! Having Linux compilers allows more people than ever to give Sun Studio a try, and see what they’ve been missing. Especially a “real” Fortran compiler!

Solaris Nevada

A lot of the Nevada features I like are also in Solaris 10 by now, but I’m not sure which. I’ve been starting to use ZFS filesystems on my group’s file server, and that reminded me of one of my favorite new features. It’s the -h option to du and df. That’s it, just one option. In reconfiguring filesystems, I end up running ‘du’ and ‘df’ commands over and over and over. It’s always a pain in the butt because the output of those commands is normally in blocks. What the hell is a “block”? I never quite trusted the OS to give me an implementation-independent answer. But the -h option stands for “human” format. You get size numbers displayed as 27.3K for kilobytes or 52G for gigabytes, and the number is rounded to an appropriately small number of significant figures. I never realized how annoying those block counts and sig figs were until I started to get used to the -h output.

I also like a lot of the Gnome desktop features in Solaris. There’s a feature called gDesklets that I was reading about the other day. I had them confused with panel applets (which are also cool), but desklets let you put clever little clocks or graphs on your background, in a similar fashion to what Apple popularized with their Dashboard widgets. You can code your gDesklets in a scripting language (Python), so they should be easy to create.

Another thing that comes to mind is the remote desktop support in Gnome (aka Vino). Lots of people where I work use VNC for working from home. But if you go back and forth between home and office, it’s not really seamless to use the vncviewer program when you’re already on your local desktop. So people will shut down all their apps when they go back and forth. There’s a ‘remote desktop’ feature that allows you to start up a VNC session and transfer all your desktop windows into the remote server. Then when you shut down the VNC session they go back to being local windows. That feature makes moving between the office and another location so much easier.

Of course, these are just some of the features. There’s also the new Network Auto-Magic configurator, wider support for wireless drivers, Gnome 2.16, etc. For a list of lots more new stuff in SXDE, look here.

Debugging tips for threaded programs on Solaris

Friday, June 1st, 2007

Phil Harmon wrote a blog entry over a year ago ( Solaris Threads Tunables ) where he mentioned a list of tunable parameters that you can use to fiddle around with the implementation of Solaris libthread. You can fine-tune the spin-lock timeouts and other timing details. But one of the flags that he mentioned is NOT related to tuning libthread. It’s more related to debugging your program! Someone on our internal dbx-interest alias asked why their program (which had a bug) was acting differently when run under dbx, and the answer turns out to be related to a “sync tracking” flag that dbx turns on by default. It turns on somewhat stricter checking for mutex bugs.

Anyway, it turns out that if you set the environment variable _THREAD_ERROR_DETECTION to 1 or 2, you get an extra level of error checking enabled inside libthread. A value of 1 produces warning messages, and 2 produces warning messages plus a core file for inspection.
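
As a minimal sketch (my own example, not from the original post) of the kind of bug this catches, here’s a program that locks the same default mutex twice from one thread. Run it with the variable set, for example _THREAD_ERROR_DETECTION=2 in the environment:

/* double_lock.c -- hypothetical example of a mutex usage bug.
 * The same (non-recursive) mutex is locked twice by one thread,
 * the kind of error libthread can report when _THREAD_ERROR_DETECTION
 * is set to 1 or 2.
 * Build (assuming Sun cc): cc -mt double_lock.c -o double_lock
 */
#include <pthread.h>

int main(void)
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

    pthread_mutex_lock(&m);
    pthread_mutex_lock(&m);   /* bug: calling thread already owns the lock */
    pthread_mutex_unlock(&m);
    return 0;
}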

The messages look like this:

*** _THREAD_ERROR_DETECTION: lock usage error detected ***
mutex_lock(0x8047d50): calling thread already owns the lock
calling thread is 0xfeea2000 thread-id 1

Most of the implementation is in libc/port/threads/assfail.c

MT programming just got easier. (vfork)

Monday, February 12th, 2007

At least on Solaris. So far. Of course, in the UNIX world good APIs tend to migrate into other implementations, so I hope Linux hackers will take note.

6497356 fork extensions (PSARC 2006/659)

The vfork call in UNIX has always been a serious issue when you’re using threads, and the posix_spawn API did a good job of addressing that.  You can read more about this issue in one of my previous blogs.

One of the remaining major problems in trying to use fork or posix_spawn in an MT process was how to deal with the SIGCHLD signal in a sane way.  One of our resident kernel hackers has addressed that issue with some new extensions to the fork() system calls.  You can read more about it in the bug report.  In addition to causing problems in threaded programs, it made it hard to use fork/exec inside a shared library in a way that was transparent to the main application.

Basically, if you don’t want a SIGCHLD, you can opt out of the signal and then use an explicit wait() call to reap the child process.  Simple.  Of course, there are multiple fork routines, as well as the posix_spawn interface, and they all need an additional flag.
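
Here’s a minimal sketch of how I understand the new interface, assuming the forkx() call and the FORK_NOSIGCHLD/FORK_WAITPID flags added by that case (check fork(2) on a current build for the real details):

/* forkx_sketch.c -- hypothetical sketch of the fork extensions
 * (PSARC 2006/659): create a child without posting SIGCHLD to the
 * parent, then reap it explicitly with waitpid().
 */
#include <sys/fork.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    /* FORK_NOSIGCHLD: don't send SIGCHLD when the child exits.
     * FORK_WAITPID: only an explicit waitpid() for this pid reaps it,
     * so a library can fork/exec without disturbing the application. */
    pid_t pid = forkx(FORK_NOSIGCHLD | FORK_WAITPID);

    if (pid == 0) {
        execl("/bin/true", "true", (char *)0);
        _exit(127);
    }
    (void) waitpid(pid, NULL, 0);   /* quietly reap our own child */
    return 0;
}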

Patch Check Advanced

Wednesday, November 22nd, 2006

I tried out the whizzy new Solaris patch GUI (updatemanager) a while back to see whether it was actually usable, and I had issues with it.  First of all, it was really, really slow when it had to analyze the currently installed patches.  Like, it was so slow you couldn’t tell whether it was hung or not.  So today I figured I’d try out a perl script that I ran across called pca. It’s called Patch Check Advanced. Because it’s a relatively simple perl script, I think it will have a much better chance of running consistently on the Solaris 8, 9 and 10 boxes we have around here. I just installed Sun Studio 11 on my Sparc machine (running a fairly ancient Nevada build, 41).  Of course, I didn’t have the patience to download half a gigabyte over the internet, so I scrounged up a copy on our network and used that.

You can run pca as a non-root user to examine the current state of the machine, and then su to root and have it automatically update your Sun Studio installation.  You can use it to list only the Sun Studio patches. It’s a little weird, because the pca script lists patches by default, and it says that the “-l” option is the default. But I got a different list of patches between doing “pca” and doing “pca -l”.  It turns out the x86 patches will be filtered out by “pca”, but won’t be filtered out by “pca -l”. So I selected the Sparc patches by using the patch description.  It turns out that Sun Studio patches are now named consistently, so SPARC patches start with “Sun Studio 11:” and x86 patches start with “Sun Studio 11_x86:”. So to list all the latest Sun Studio patches on a Solaris machine, I used this command:

% pca -l '/Sun Studio 11:/'
Download xref-file to /var/tmp/patchdiag.xref: done
Using /var/tmp/patchdiag.xref from Nov/21/06
Host: steppe (SunOS 5.11/snv_41/sparc/sun4u)

Patch  IR   CR RSB Age Synopsis
------ -- - -- --- --- -------------------------------------------------------
120760 -- < 11 ---   6 Sun Studio 11: Compiler Common patch for Sun C C++ F77 F95
120761 -- < 02 --- 145 Sun Studio 11: Patch for Performance Analyzer Tools
121015 -- < 03 ---  14 Sun Studio 11: Patch for Sun C 5.8 compiler
121017 -- < 06 ---  14 Sun Studio 11: Patch for Sun C++ 5.8 compiler
121021 -- < 05 ---  36 Sun Studio 11: Patch for Fortran 95 Dynamic Libraries
121019 -- < 03 ---  71 Sun Studio 11: Patch for Fortran 95 8.2 Compiler
121023 -- < 03 ---   6 Sun Studio 11: Patch for Sun dbx 7.5 Debugger
121623 -- < 02 --- 145 Sun Studio 11: Patch for RHEL4 and SuSE9 Linux Performance Analyze
122135 -- < 02 ---  43 Sun Studio 11: Patch for Sun Performance Library
122142 -- < 02 ---   6 Sun Studio 11: Patch for dbx GUI plug-in and CPP modules

From this list you can see one Linux patch (which is just a freshened RPM, not really a “patch”).  I don’t think the sunsolve patch index data has a field to identify non-Solaris patches.  We should probably add that so that tools can skip such patches. You can see from the “-- < 11” part that pca is telling me I don’t have any patches installed, and hence my current revision level is less than (<) the revision available from sunsolve.  Here is what it looked like after I updated:

bash # pca -l '/Sun Studio 11:/'
Download xref-file to /var/tmp/patchdiag.xref: done
Using /var/tmp/patchdiag.xref from Nov/21/06
Host: steppe (SunOS 5.11/snv_41/sparc/sun4u)

Patch  IR   CR RSB Age Synopsis
------ -- - -- --- --- -------------------------------------------------------
120760 11 = 11 ---   6 Sun Studio 11: Compiler Common patch for Sun C C++ F77 F95
120761 02 = 02 --- 145 Sun Studio 11: Patch for Performance Analyzer Tools
121015 03 = 03 ---  14 Sun Studio 11: Patch for Sun C 5.8 compiler
121017 06 = 06 ---  14 Sun Studio 11: Patch for Sun C++ 5.8 compiler
121019 03 = 03 ---  71 Sun Studio 11: Patch for Fortran 95 8.2 Compiler
121021 05 = 05 ---  36 Sun Studio 11: Patch for Fortran 95 Dynamic Libraries
121023 03 = 03 ---   6 Sun Studio 11: Patch for Sun dbx 7.5 Debugger
121623 -- < 02 --- 145 Sun Studio 11: Patch for RHEL4 and SuSE9 Linux Performance Analyze
122135 02 = 02 ---  43 Sun Studio 11: Patch for Sun Performance Library
122142 02 = 02 ---   6 Sun Studio 11: Patch for dbx GUI plug-in and CPP modules

As you can see, the Linux patch didn’t get installed, but it’s still listed.

To update my Sun Studio installation, I used this command:

# pca -G -i '/Sun Studio 11:/'

Don’t forget to add the -G option on Solaris 10.  This just passes -G to the patchadd command happening under the covers.  It’s necessary with Sun Studio patches on Solaris 10 because of a bug relating to zones. I thought I would have to configure my sunsolve name/password in there somewhere, but it seemed to work anyway.  I’ve probably wired those settings into a config file someplace and forgotten about them.  I know I configured updatemanager with that information, so maybe the pca script is using a Solaris utility that’s layered on top of some other utility that knows my name/password.

I’ve been thinking about the patch management issue for a while.  As far as I’m concerned, Linux has us totally beat in this area.  The majority of software that’s “part” of a Linux system isn’t installed by default; you just choose it from a GUI to download and install it.  Updates are handled with the same infrastructure.  On the other hand, Solaris has all sorts of wonderful network-based install/maintenance tools (Live Upgrade, etc.) geared towards enterprise users. Those things have absolutely no bearing on my life whatsoever.  I need something trivial and ubiquitous and point-and-shoot.

Aside: Computer companies have always gone out of business from the bottom up.  I hope Sun doesn’t use all our wonderful Enterprise features as an excuse to ignore the desktop and small-business users of the world. The mainframe computer companies in the ’80s had their users taken away by PCs that were “good enough” for small Mom and Pop businesses.  Of course, when Mom and Pop wanted to upgrade, they would naturally request new features from the PC vendor, instead of hiring an IT consultant and “going enterprise”.  It’s sort of like that picture of a fish eating a littler fish, and simultaneously being eaten by a bigger fish, only the market is complex enough that it’s more like a circle. In the computer biz everyone frantically tries to out-innovate each other.  As long as Sun’s chasing more than we’re being chased, I think we’re okay.  (I don’t mean ‘chasing’ as in playing catch-up, I mean chasing, like trying to take someone else’s market away from them by building new stuff.) Anyway, I get worried whenever I see a company concentrating on “enterprise” customers and ignoring all the little guys who will become enterprise customers in 5 years’ time.

Large established companies that are willing to try a revolutionary new technology seem few and far between; if you’ve got hot new ideas to show around, you want to start with the hobbyists and the little guys out there. That’s the lesson I’ve learned from watching Linux.

malloc interposition can’t possibly work reliably

Thursday, October 5th, 2006

In Solaris, many of the routines called from libc are “direct bound” so that references from inside libc will always find the function implementations that are inside libc. This approach prevents outside libraries from interposing (substituting) different implementations of common functions. The largest exception to this is the malloc family of routines. The malloc routines (when called by libc, for example from strdup) MUST call an externally supplied malloc routine, if one is supplied via LD_PRELOAD or library link ordering.

There is a huge gotcha related to malloc interposing. If you get a pointer from malloc, you have to free it using the free routine in the same library that allocated it. But how do you guarantee that? If every program has libc, then every program will have at least one allocator in it. Any program that uses libumem will have at least two (one from libumem and one from libc). If the user wants to LD_PRELOAD their own memory checker library, it just gets worse.

It gets even worse because malloc libraries implement many additional routines to allocate memory. Let’s say my app calls valloc in libc. Let’s say I want to interpose libmymalloc because it has a spiffy new memory checker that I want to use. Now let’s say libmymalloc doesn’t include a definition for valloc. My app will crash, because valloc gets memory from the libc pool, and free will free it to the libmymalloc pool.
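
Here’s a tiny sketch of that scenario (a hypothetical illustration; libmymalloc is a made-up name): the valloc() call resolves to libc, but free() resolves to whatever allocator was interposed.

/* valloc_mismatch.c -- illustration of the interposition hazard.
 * With plain libc this is fine.  But if you LD_PRELOAD an allocator
 * that defines malloc/free and NOT valloc, then p comes from the libc
 * pool and free() hands it to the interposer's pool -- undefined behavior.
 */
#include <stdlib.h>

int main(void)
{
    void *p = valloc(8192);   /* page-aligned allocation, libc version */
    free(p);                  /* which free() this binds to depends on
                                 LD_PRELOAD and library link order */
    return 0;
}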

At this point there are people who will say: “Easy, just make sure they all implement the same set of functions.” Well yes, that would solve the problem, if there were a way to do it. But there is no standard for what this list of routines is. Memory allocation libraries are useful because they offer additional functionality beyond the plain malloc and free in libc. So they will always be adding functions that are not in anyone else’s implementation. If I write my app to use libmalloc_alpha.so, and someone interposes libmalloc_beta.so, then all the custom functions in libmalloc_alpha.so that I was calling will still go to libmalloc_alpha.so, but all the standard ones will go to libmalloc_beta.so. The result is undefined.

Unfortunately the idea that you can replace a memory allocator library by just interposing a different one is a widely known “fact”. You can read about how this bit the Solaris libraries in bugid 4846556. The problem came up in comp.unix.solaris recently as well.

Enterprise versus The Developer

Monday, September 18th, 2006

Note: I’m just an engineer at Sun.  What follows is my own personal perspective, and not to be taken as Sun’s official opinion in any way.  To the best of my knowledge I’m not giving away any trade secrets, but I am speaking frankly about Sun’s business model.

I attended an all-hands meeting with Rich Green (Sun’s head of software) today, and there was some discussion at the end about Sun’s approach to the desktop market.  During the discussion I was struck by a sudden perspective. That happens to me a lot, but I don’t often take the time to write up my perspective or try to communicate it to people.  This time I figured I’d share my ideas.

I guess you could summarize the whole rest of this essay by saying that I believe that, in the long run, the hearts and minds of the software developer community at large will be won or lost on the basis of the desktop. To understand what I mean by this, read on.

I use Windows at home.

As a Sun employee, I’ve had the usual dilemma for many years now. Should I run Solaris at home? I have a computer, and I know how to do my own administration. I could run anything I wanted to. Solaris, Linux, Apple, Windows, whatever.  Today I’m running Windows at home because I don’t have the time or energy to maintain more than one computer, or to maintain more than one operating system.  And Windows runs the software I want to run. Games, productivity software, random internet crap. (No snide comments about viruses, please.)  I have the same problems as 100 million other people, and when I have a problem, I just google “<my problem>” and up pops the answer.

So how does this relate to Sun and Sun’s business?  Well, I’m getting to that.  During this discussion today about the future of Sun in the desktop market, I was listening at home on my Windows box.  But I’d spent the entire day earlier developing Solaris software.  With my Windows box.  I count myself as an engineer.  Most days, I spend most of my time concerned with code.  But the vast majority of tasks I do can be done natively on Windows.  I read and write email. I update internal and external wiki sites. I browse Sun’s internal web. I update bugs using a Java bugster client. I read PDF specs. I use term windows to log into Solaris machines to build things, reproduce bugs, and do other tasks.

Of course, for more intense code hacking, I need XEmacs with local NFS access to my sources, so there are some things I can’t do from home. But I could do them just fine from a Windows machine at work, if I bothered to take my laptop to work.  I was operating today in what I call “hybrid developer” mode. Using one desktop OS to develop software for another OS.

Sun does Enterprise.

Sun has enterprise-class hardware, lots of big iron.  Sun has an enterprise-class OS, Solaris.  Sun has an enterprise-class software stack with open, standards-based servers.  Sun’s business seems to be totally oriented towards feeding large IT departments, telcos, banks, etc. what they need.  Big iron.  But what about Sun’s smaller rack-mount systems, you ask? And what about Sun’s desktop machines? In my opinion, Sun’s smaller hardware boxes are essentially spin-offs to capitalize on technology that we developed for enterprise-class machines and software.  Inside Sun we’re focused on the customers who buy our stuff.  (As a stockholder, I’m very pleased to see this!) But it represents a bias in our thinking, and a bias in Sun’s internal engineering culture.

I love Solaris, but I’m not an admin.

So here’s my deal. I’m a Solaris developer. Wait, let me be more clear. I’m a developer of software that RUNS on Solaris.  I am NOT a developer OF Solaris. I’m not in the kernel group. I’m not in the desktop group.  My job doesn’t require me to run BFU to install the latest nightly build of Solaris.  I can do Solaris system administration if I need to, but I don’t do it for fun, and it’s not part of my job.  I love using Solaris, but I don’t love administering it. Solaris is pretty damn painful to administer compared to a desktop OS.  But that’s not a fair comparison, because Solaris is totally geared towards enterprise users, and not towards desktop users.

Software Updaters.

One area of functionality I wanted to talk about is web-based software installers and updaters. There are two groups of people I want to talk about here. Each group sees a problem, and is trying to solve it. And each group thinks the other group’s problem is the same as theirs. (Actually, these groups are imaginary groups, because I’m really talking about perspectives and not individual people.  The perspective differences in Sun cause language problems, communication breakdown and lack of synergy between groups.)

One group has the enterprise perspective.  It’s focused on things like smpatch and updatemanager for delivering Solaris patches. The other group has the desktop perspective.  It’s focused on things like blastwave.org and pkg-get for delivering things like Ruby compiled for Solaris x86. The desktop perspective says: Linux uses apt-get (or Red Carpet or whatever) to update application software and OS packages both, so why can’t Solaris just convert to something better than patchadd and pkgadd? The enterprise perspective says: Well, once we get updatemanager up and running a little better, we can eventually start to include unbundled applications in our centrally controlled, server-based software distribution model.  Both perspectives make perfect sense, but only if you’re looking in one specific direction.

How to get Developers.

Sun has several approaches it could use to attract Solaris developers. I’ll list them in order of the impact they would have on developers:

1) Get Solaris on lots of desktops so that it will be a natural and easy starting point for new software development.
2) Embrace more fully the “hybrid” nature of much of today’s Solaris development.
3) Do whatever we can to encourage multi-platform projects to support and distribute software for as many flavors of Solaris as possible.

Option 1 puts the user dead center inside a rich environment of wonderful Sun technology like dtrace, ZFS, zones, Xen support, etc., etc. This will be contagious.  If you’re familiar with the Sun desktop, and you’re familiar with the Sun administration commands, your software will end up working better on Solaris, and you’ll be more likely to stick with Solaris. That’s good for Sun.

Option 2 is one step removed. Hybrid development is what I do at home.  Use a Windows, or Linux, or OS X desktop, but develop software for Solaris. If you’re doing primary development on Solaris, your software will be more likely to run best on Solaris.  I might still be using Solaris as my primary development platform for the software that I build, but my daily interaction will be with a different OS: the one that runs on my desktop.

Option 3 is another step removed.  Sun will benefit if more open source software is compiled and distributed for Solaris.  If Sun can make it easier to port software to Solaris, that’s a step forward, too.  And there’s plenty we could do to help that.

All of these options are opportunities for Sun to focus on, if we really want to get more Solaris developers. Development tools support and basic OS support could be tuned towards supporting those kinds of users.

But it’s a slippery slope.

Having your OS be on the developer’s desktop is the core of getting a really healthy developer community. As the user gets more and more removed from Solaris, they start to see it as just another platform that they might or might not port to.

Getting started with dtrace

Friday, May 12th, 2006

Vijay forwarded me an email from Eugene and here’s what I wrote:

Eugene wrote:
I have a code and I'd like to figure out where the user code is
when brk is being called.  Can this be done with truss?  If it's
done with dtrace I need some serious handholding.  Canned
scripts (or whatever) would be nice since I need to resolve this
in a hurry.

Here is what I recommend:

1. download/untar the DTrace toolkit

2. run this command to see the stacks of all places that call brk

./DTraceToolkit-0.96/Bin/dtruss -s -t brk /bin/ls

Unless you give yourself dtrace permissions in /etc/user_attr, you will need to be root to run dtrace. The toolkit has a bunch of scripts in it that do wonderful things. And all without knowing anything about dtrace.


I went ahead and fiddled with dtrace for a bit, and here’s what I came up with:

% dtrace -n 'pid$target::sbrk:entry { @num[ustack()] = count()}' -c "find /usr -name 'xyzzy'"

dtrace: description 'pid$target::sbrk:entry ' matched 1 probe
^C
dtrace: pid 17201 terminated by SIGINT

              libc.so.1`sbrk
              libc.so.1`_morecore+0x24
              libc.so.1`_malloc_unlocked+0x1fc
              libc.so.1`_smalloc+0x4c
              libc.so.1`malloc+0x4c
              libc.so.1`calloc+0x58
              libc.so.1`textdomain+0x38
              find`main+0x1c
              find`_start+0x108
                1

              libc.so.1`_morecore+0xdc
              libc.so.1`_malloc_unlocked+0x1fc
              libc.so.1`_smalloc+0x4c
              libc.so.1`malloc+0x4c
              libc.so.1`calloc+0x58
              libc.so.1`textdomain+0x38
              find`main+0x1c
              find`_start+0x108
                1

The number after each stack shows the number of times that stack trace was encountered.

Oh yeah. Did I tell you how much I hate the wysiwyg editor I’m using in Roller? Of course, once I get the content in the little box, it’s too much trouble to change editors. It took me almost as long to get the preformatted text right as to write the email, figure out the script, and write the rest of this blog.

Silly me. I forgot about using Xinha! It’s in my firefox, but I forgot all about it.  I’m using it to add this last paragraph, and it’s working fine.

How to use libumem to find a bad free call

Thursday, March 23rd, 2006

I have not seen any good simple tutorials on how to use libumem for debugging.  (Unless you also want to learn how to use mdb).  So I wrote a simple example.

% more t.c

#include <stdio.h>
#include <stdlib.h>

int main()
{
    int i;

    free(&i);           /* bug: freeing a pointer that never came from malloc */
    i = 10;
    char * p = (char *) malloc(1000);
}

This program has a bug, and it might crash or it might not. It might crash right away, or it might crash after running longer (if it had more code after the bug). Running with libumem, even with its default options, turns on more basic assertion checking.

% cc -g t.c
% a.out
% # notice no crash
% LD_PRELOAD=/lib/libumem.so ./a.out
Abort (core dumped)
% dbx -c 'where;quit' - core
Corefile specified executable: "/home/quenelle/./a.out"
Reading a.out
core file header read successfully
Reading ld.so.1
Reading libumem.so.1
Reading libc.so.1
Reading libc_psr.so.1
program terminated by signal ABRT (Abort)
0xff2c0f90: __lwp_kill+0x0008:  bcc,a,pt  %icc,__lwp_kill+0x18  ! 0xff2c0fa0
Current function is main
    8       free(&i);
  [1] __lwp_kill(0x0, 0x6, 0x0, 0x0, 0x0, 0x0), at 0xff2c0f90
  [2] raise(0x6, 0x0, 0x20f90, 0xff36b7cc, 0xff38a000, 0xff38abc4), at 0xff25fd78
  [3] umem_do_abort(0x4, 0xffbfe6c0, 0x6, 0x20ecc, 0xff37680c, 0x0), at 0xff3690fc
  [4] umem_err_recoverable(0xff377818, 0xa, 0x20dc4, 0xff38a6fc, 0xff38d0d0, 0xff377823), at 0xff3692a0
  [5] process_free(0xffbfe9d8, 0x1, 0x0, 0x3e3a1000, 0x1ee5c, 0x20c28), at 0xff36b2b0
=>[6] main(), line 8 in "t.c"

Abort (core dumped)

This trick can often be used to find the place where your malloc/free bug happened.  There are some environment variables you can use to control the behavior of libumem. You can read more about them in the umem_debug man page.  You can also find out more about libumem by reading the various white papers that are available.  You can do a Google search on “libumem” or “libumem solaris” to find out more.

simple umem integration with dbx

Monday, July 25th, 2005

The mdb debugger has some really nifty integration with libumem, as documented in this technical article, and Adam Leventhal’s famous Top 20 Blog.

I got a request recently asking if dbx had similar features. I think the engineer who asked was already familiar with the memory checking features dbx has (Run Time Checking), but the simplicity and speed of libumem checking have their own advantages.

So anyway, after a bunch of fiddling around in my C.S.T. (copious spare time) I got a ksh script together that I’m not completely embarrassed to share with the world. It has a few basic functions like:

  • Dump the transaction log
  • Show all the transaction log entries for a specific block, along with the stack trace for each log entry
  • Given any address, search the transaction log for any blocks containing that address

The dbx libumem module has a simple demo script in the documentation and a few tech notes. Please try it out if you’re interested, and let me know how it goes. If you have trouble downloading it because it doesn’t have a .txt extension, let me know.

Update:

There is a Solaris bugid asking for libumem to export this information in a stable form. It is bugid:
6297789 libumem could use a libumem_db
