There has been some discussion where I work about what kinds of tools are good for supporting multi-threaded programming. There are lots of different kinds of tools and features that can help. My natural tendency is to try to create categories and taxonomies when I get a bunch of different-but-related ideas. So here is my attempt to give an overview of how a development tool chain can support threads from beginning to end.
MT Tools Taxonomy
For the sake of contemplation, you can divide software development tools into two categories. Some tools are used to create your program, and other tools operate on your program after it’s been created.
The primary tools in the first category are compilers and linkers. Also included are pre-processors (auto-parallelisers, etc.) and static checkers (lint). These tools don’t get to see the program in action. They don’t depend on the dataset or runtime environment of the program, so they must assume that all possible paths through the code are important and relevant. These are “compile-time” tools.
The primary tools in the second category are debuggers and performance tools. Also included are dynamic checkers (for synchronization, memory allocation, code coverage (like tcov), etc.). These tools observe the program in action. But they are only as good as the dataset you feed into your running application. (If your program is interactive, then the dataset is the pre-canned or user-performed set of actions that exercise your program when it runs under the tool.) These tools have the advantage and disadvantage that they can focus on frequently executed paths, and ignore code paths that are not executed in practice. These are “runtime” tools.
Because of the recent popularity of multi-core chips, tools to help create and maintain threaded applications are more important than ever. There are two aspects to this: decomposition (how to break your task into multiple threads) and synchronization (how to coordinate those threads correctly). There are few tools to help programmers effectively decompose their applications into threads. Most people focus on trying to get the synchronization right, and most threading tools are created to help with that kind of task.
Compile-time tools focus on correct code generation, and they can help the user implement correct synchronization in a variety of ways. The user can insert directives that explain to the tool chain how the synchronization is intended to work. Such directives can be implemented as pragmas or language extensions.
Assertion-style directives can be used by the compiler or an external tool to verify synchronization operations. For example (assuming a structure type named ST), the user could declare that “lock ST.lk protects variable ST.data”. As with lint, the user will probably need ways to suppress specific warnings; such suppression directives serve as red flags when it comes time to debug problems. This style of directive is better implemented as a pragma instead of a language extension, because it is assumed that the presence or absence of these directives doesn’t affect program correctness.
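To make this concrete, here is a minimal Python sketch of the kind of checking an assertion-style directive enables. The `ProtectedBy` descriptor and the check itself are invented for illustration; real tools would operate on native code, and note that `Lock.locked()` only says that *some* thread holds the lock, not necessarily the current one.

```python
import threading

class ProtectedBy:
    """Hypothetical runtime check for the directive
    'lock ST.lk protects variable ST.data'."""
    def __init__(self, lock_name):
        self.lock_name = lock_name
        self.attr = None
    def __set_name__(self, owner, name):
        self.attr = "_" + name
    def _check(self, obj):
        lock = getattr(obj, self.lock_name)
        # Approximation: locked() cannot tell WHICH thread holds the lock.
        assert lock.locked(), (
            f"{self.attr} accessed without holding {self.lock_name}")
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        self._check(obj)
        return getattr(obj, self.attr, None)
    def __set__(self, obj, value):
        self._check(obj)
        setattr(obj, self.attr, value)

class ST:
    data = ProtectedBy("lk")   # directive: lock ST.lk protects ST.data
    def __init__(self):
        self.lk = threading.Lock()
```

Accessing `st.data` while holding `st.lk` succeeds; accessing it without the lock trips the assertion, which is exactly the kind of warning such a tool would emit.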
Implementation-style directives can support higher level synchronization operations in the compiler. For example, “always protect variable ST.data with lock ST.lk”. The compiler would make sure to generate lock and unlock primitives to protect all accesses to the structure member named data. This kind of directive would be better implemented as a language extension, because its primary purpose is to drastically affect program correctness.
An example of what I mean by Implementation-style directives is the “synchronized” keyword in Java. This keyword supports a rich set of synchronization styles in Java.
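You can sketch what such a directive (or Java’s synchronized methods) effectively expands to. This Python class decorator is a hypothetical illustration: it wraps every public instance method so that it acquires a per-object lock, roughly the lock/unlock code a compiler would generate for you.

```python
import functools
import threading

def synchronized(cls):
    """Invented sketch of an implementation-style directive:
    every public instance method runs under a per-object lock.
    Simplified: handles plain instance methods only."""
    def wrap(method):
        @functools.wraps(method)
        def locked(self, *args, **kwargs):
            with self._monitor:
                return method(self, *args, **kwargs)
        return locked
    for name, attr in list(vars(cls).items()):
        if callable(attr) and not name.startswith("_"):
            setattr(cls, name, wrap(attr))
    orig_init = cls.__init__
    @functools.wraps(orig_init)
    def init(self, *args, **kwargs):
        self._monitor = threading.RLock()  # reentrant, like Java monitors
        orig_init(self, *args, **kwargs)
    cls.__init__ = init
    return cls

@synchronized
class Counter:
    def __init__(self):
        self.value = 0
    def increment(self):
        self.value += 1   # protected: runs under the per-object lock
```

Multiple threads calling `increment()` on one `Counter` never lose an update, because every call is serialized on that object’s monitor.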
Another example of Implementation-style directives is the OpenMP system. The primary purpose of the OpenMP directives is to specify the way work is broken down into chunks, which makes it different from the other tools I’m talking about here (which focus on synchronization implementation and checking).
Implementation-style directives will allow the compiler to optimize your code more effectively. Memory operations that don’t need to be synchronized can be moved outside of critical regions, and critical regions can be merged, split, and moved around much more effectively when the compiler knows exactly what semantics need to be preserved.
The barrier to entry for implementation-style directives will be high, because they make your code non-portable until standards (either official or de facto) are created for such directives. You could omit the directives on a system that doesn’t support them and still have a properly functioning sequential program, but your code would have to be designed and implemented to operate correctly in both modes. Doing this is straightforward but non-trivial; it doesn’t come for free.
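The dual-mode discipline can be illustrated with a small Python sketch (a thread pool standing in for a directive-based system like OpenMP; the names are mine, not a real directive system): because the loop body shares no mutable state, the code computes the same answer whether the work-sharing machinery is available or not.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # No shared mutable state: each chunk is independent,
    # so serial and parallel execution give the same answer.
    return sum(x * x for x in chunk)

def sum_squares(data, parallel=True, workers=4):
    """Sum of squares, written to be correct in both modes."""
    chunks = [data[i::workers] for i in range(workers)]
    if parallel:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(partial_sum, chunks))
    return sum(map(partial_sum, chunks))   # sequential fallback
```

The sequential fallback is what you get on a system where the “directive” is unsupported; keeping both paths correct is the straightforward-but-non-trivial work described above.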
Other implementation-style directives can support a wide variety of high-level synchronization mechanisms: critical sections (a block of code protected by an automatically allocated global lock), and synchronized structures and classes (for example, synchronized classes in C++) whose members are automatically protected by a per-object lock.
The more information the compile-time tools have, the faster and more correct the resulting program can be. If the tools support a wide variety of assertion-style and implementation-style directives, then their error checking will be more likely to find real problems, and their optimizations will be both more aggressive and more likely to be correct.
Runtime tools focus on analyzing, controlling and inspecting the behavior of a running program. They are less helpful during the creation phase of development, and more helpful in the debugging and tuning phases. Some runtime tools implement a harness that executes checking code alongside the running program, and some tools collect event traces at runtime and do the analysis later. Runtime tools depend on significant support from the compilers: source line information, type information, and compile-time instrumentation (for data collection) are widely used by runtime tools.
Runtime checking tools can help repair and improve threaded programs with a variety of thread-specific features. They can check for data races in threaded programs (either at runtime, or post hoc) by analyzing the memory accesses and lock operations performed by a test run of the program.
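One classic approach such a checker can take is the lockset algorithm (as in the Eraser race detector): track, for each variable, the set of locks held at every access, and warn when that set becomes empty while more than one thread is involved. Here is a toy post-hoc version over a recorded trace; the event format is invented for illustration.

```python
def lockset_check(trace):
    """Toy Eraser-style lockset race check over an event trace.

    trace: list of (thread_id, var, locks_held) access records, as a
    runtime tool might log them. Reports variables whose candidate
    lockset becomes empty after accesses from more than one thread,
    i.e. no single lock consistently protected the variable.
    Simplified sketch, not a real detector."""
    candidate = {}   # var -> locks that protected every access so far
    threads = {}     # var -> set of threads that accessed var
    races = set()
    for tid, var, held in trace:
        candidate[var] = candidate.get(var, set(held)) & set(held)
        threads.setdefault(var, set()).add(tid)
        if len(threads[var]) > 1 and not candidate[var]:
            races.add(var)
    return races
```

In a trace where variable "x" is always accessed under lock "L" but variable "y" is sometimes accessed with no lock held, only "y" is flagged.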
Performance tools can detect which locks are the most frequently accessed, which locks are most likely to cause a thread to block, and which locks result in the longest total wait times. By doing this, the tool can help you improve the performance of your synchronization. They can also detect violations of any assertion-style directives that the user has put into the code.
Debugging tools can show details of running threads: which ones are blocked on which synchronization objects, and which locks are held by a particular thread. Debuggers can also allow you to suspend or resume individual threads in order to provoke certain conditions in your program.
As I said, which of these tools we might extend or develop is still being discussed around my part of Sun. If you have feedback on which areas need more attention, or what you think we should do, let me know.