Archive for the ‘Developer Tools’ Category

Virtualization Terms 2 – It’s all about the Hypervisor

Thursday, November 8th, 2012

A hypervisor is a layer of software that creates one or more virtual hardware systems. This allows multiple system images (operating system instances) to run at the same time (Wikipedia). Understanding how the hypervisor is implemented will let you predict the relative performance overhead compared to other forms of virtualization. It will also help you understand the features and limitations of each form. The table below sorts the different forms of virtualization from the more concrete (closer to the hardware) to the more abstract (further away from the hardware).

 

Product Names | Location of HV | Explanation
SunFire / M-Series Dynamic System Domains (aka Hard Partitions) | none | HW routing to machine partitions is done using HW only.
LDOMs (aka Oracle VM Server for SPARC) | firmware |
Xen / Sun xVM (aka Oracle VM Server for x86) | kernel replacement (aka Type 1) |
Linux KVM | kernel integrated |
VirtualBox / VMWare | application (aka Type 2) |
Solaris Zones (aka Containers) | none | HW resources are virtualized from the application viewpoint, not from the kernel viewpoint.

Note: This is an update to a post from two years ago.  This one supersedes the older post.

Is shell history an anachronism?

Friday, September 16th, 2011

I work in an environment that has user home directories shared over NFS.  I always thought that made the normal shell history mechanism fall on its face.  None of the shells I’ve seen will actually do the hard work of synchronizing the shell history file to collect data from multiple hosts in one file.  It even falls apart when you have multiple terminal windows open on one machine.  Many years ago I realized I didn’t want my shells writing frequently to my home directory over NFS, so I relocated my history file to /tmp.  This means I’ll get history restored when I log into the same machine (until it gets rebooted), but it’s pure luck which session on the same machine saved its history last.  Bash is my normal shell these days, and it has a lot of features to tweak and manipulate the history, but none of them seem to deal with the inherent sync issues.  I suspect everyone uses history within their current shell session, and nobody much cares whether it is saved or not.

The reason I care is because I’m looking at using it as a platform to associate command history with logical projects.  It’s an interesting idea, but I’m surprised the whole mechanism is so poorly adapted to modern environments.

The ever-present “util” module.

Sunday, September 12th, 2010

Every major library or application I write seems to have a module named “util” these days.  I think it represents a kind of “impedance mismatch” between the platform I’m using (C runtime, C++ runtime, python standard libraries) and the platform I *wish* I were using.

Recently, I’ve been writing python code that runs lots of little UNIX utilities.  You know, like: find, ls, chmod, etc, etc.  It’s the kind of code that might also be written as a shell script, but python is much nicer when the program gets larger than about a page.  If you’re running lots of utilities, you want a variety of ways to interact with them.

Sometimes you don’t want to send a utility any input, sometimes you do.  Sometimes you’re expecting one line of output, sometimes a list of lines.  Sometimes you’re going to check the return code, sometimes you’re not.  These functions are all just small wrappers around calls to the python subprocess module.  But if you’re writing a lot of code that uses them, it’s important to make that code readable, so you want to streamline away most of the goop for dealing with the subprocess module.
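As a rough illustration of the kind of wrappers described here, a minimal sketch in current Python 3 might look like the following. The function names (run, run_lines, run_line, succeeds) are hypothetical and are not taken from the original util module; only the subprocess calls themselves are standard library.

```python
import subprocess


def run(cmd, input_text=None, check=True):
    """Run cmd (a list of strings) and return everything it wrote to stdout.

    input_text, if given, is sent to the command's stdin.
    With check=True a non-zero exit status raises CalledProcessError.
    """
    result = subprocess.run(
        cmd,
        input=input_text,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # work with str instead of bytes
        check=check,
    )
    return result.stdout


def run_lines(cmd, **kwargs):
    """Run cmd and return its output as a list of lines."""
    return run(cmd, **kwargs).splitlines()


def run_line(cmd, **kwargs):
    """Run cmd when exactly one line of output is expected."""
    lines = run_lines(cmd, **kwargs)
    if len(lines) != 1:
        raise ValueError("expected one line of output, got %d" % len(lines))
    return lines[0]


def succeeds(cmd):
    """Run cmd, discard its output, and report whether it exited with status 0."""
    status = subprocess.call(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return status == 0
```

With helpers like these, a call site could shrink to something like run_lines(["find", ".", "-name", "*.py"]), which keeps the subprocess details out of the calling code.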

I have utility routines for creating temporary files and removing them all when the program exits. There are routines to keep me from adding a lot of obscure import statements to the top of most of my modules.

Here are some examples of what I’m using for now:

def gethostname():
   from socket import gethostname
   return gethostname()

def timestamp():
   import datetime
   return str(datetime.datetime.today())

Here’s a recipe that I got from stackoverflow.com.  I wanted the equivalent of “mkdir -p”, and you need a few lines to do that in python.

def mkdir_p(path):
  import errno
  import os
  try:
    os.makedirs(path)
  except OSError as exc:
    # Already existing is fine; anything else is a real error.
    if exc.errno == errno.EEXIST:
      pass
    else:
      raise

There’s also code to do things that I’m convinced must have a better answer in python, but I haven’t found it yet.  So I isolate the hack to the util module.

def is_executable(file):
  import os
  S_IEXEC = 0o100  # owner-execute permission bit, written as an octal literal
  mode = os.stat(file).st_mode
  return mode & S_IEXEC
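For what it’s worth, the standard library does appear to offer more direct routes for this check. The sketch below shows two of them: os.access with os.X_OK, which asks whether the current (real) user may execute the file, and the named constant stat.S_IXUSR, which replaces the hand-written octal bit. The function names are hypothetical, and the two functions answer slightly different questions, so which one fits depends on what the caller actually needs.

```python
import os
import stat


def can_execute(path):
    """True if the current (real) user is allowed to execute path."""
    return os.access(path, os.X_OK)


def owner_can_execute(path):
    """True if the owner-execute permission bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)
```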

Moving code in and out of my util module also prevents me from worrying so much about obscure modularity issues. Any code I don’t want to worry about today goes into the util module. When I know where it belongs, I can easily move it later. Of course, that’s much easier to do with python than in a language that uses header files like C or C++.

On Iteration by Andrei Alexandrescu

Wednesday, June 23rd, 2010

I just finished reading a great article on iterators by Andrei Alexandrescu.  Mr. Alexandrescu is a contributor to the D programming language.  In this paper, he discusses the background of iterator implementations, including C++ STL iterators, and then goes on to outline a new model for iterators.  It’s very readable, and I recommend it.

http://www.informit.com/articles/article.aspx?p=1407357

To get a more readable all-in-one page, click on the “print” link on the page above, or go here:

http://www.informit.com/articles/printerfriendly.aspx?p=1407357

Charles Stross on EBooks

Friday, June 18th, 2010

I just read a nice essay by science fiction author Charles Stross about EBooks.  As usual, he presents a very lucid and entertaining look into the world of publishing.

CMAP #9: Ebooks

Virtualization terms

Wednesday, June 16th, 2010

Update: A newer version of this post (find it here) was recently created.

Okay, before I forget, I’m writing it all down.

We have to test against all this stuff, and it’s becoming more and more convenient to use virtualization as a way to share lab resources, so I figured I’d go make sense of all the terminology that’s flying around.  I understood 80% of it, but I could never understand all of it at once.  A lot of this was extracted from Wikipedia.

Here are the things that affect my life: Xen, VirtualBox, VMWare, LDOMs, Zones, Containers.

Hypervisor : Software that emulates a hardware platform, so that Operating Systems can run on top of it, as if they had hardware to run on.

OS Virtualization: When you have one OS (one kernel) running multiple user-spaces. Applications think they are on separate machines.

There are two kinds of Hypervisors, some run directly on hardware (Type 1), and some run as applications (Type 2).

With those terms defined, here is a description of the technologies, features, products that I listed at the top:

  • Hypervisors:
    • Running on hardware – Type 1 Hypervisor
      • Xen: Hypervisor that runs on hardware, supports x86 (aka Sun xVM)
      • LDOMs: Hypervisor that runs on hardware, supports SPARC
    • Running as an application – Type 2 Hypervisor
      • VirtualBox: Hypervisor that runs as an application, supports x86
      • VMWare: Hypervisor that runs as an application, supports x86
  • OS Virtualization
    • Solaris Containers/Zones

The terms “zone” and “container” seem to be interchangeable. I have not found a source that is both clear and authoritative that can tell me the difference.

Zones are capable of running different versions of Solaris inside one Global OS instance.

There are lots of things I glossed over here, but my goal was to keep it short and sweet.

Trivia:

  • You can run a specific old version of Linux inside a Solaris zone.
  • The VMWare company probably supports products on chips other than x86
  • There are lots of differences between the features of Xen and LDOMs that I didn’t discuss

Which version of Sun Studio do I have installed?

Friday, May 14th, 2010

Recipes for supported packaging formats

Sun Studio is available on three different packaging systems. Here are some examples that show you how to get information about the Sun Studio packages on each kind of system.

  • IPS packaging system – on OpenSolaris
  • SYSV packages – on Solaris 10
  • RPMs – on SuSE and RedHat Linux

If you want to know what version of a Studio component you’re using, the steps are shown below.  The compiler or tool you’re interested in might be on your search path (you can find the location with “which cc”) or you might already know the full path.  Once you have the full path, here are the things you might want to find out:

  1. Find out the name of the package containing that binary.
  2. Dump out information about that package.
  3. Optionally look for other packages from the same Studio release, to see what else is installed.

Generally the multiple packages that make up Sun Studio will use a similar naming convention.  In the currently available releases, these package names are cryptic.

Sun Studio 12 update 1 installed on Solaris 10

What version is built into the binary?

% /opt/sunstudio12.1/bin/cc -V
cc: Sun C 5.10 SunOS_sparc 2009/06/03
usage: cc [ options] files.  Use 'cc -flags' for details

Which package is that binary in?

% pkgchk -l -p '/opt/sunstudio12.1/bin/cc'
NOTE: Couldn't lock the package database.
Pathname: /opt/sunstudio12.1/bin/cc
Type: symbolic link
Source of link: ../prod/bin/cc
Referenced by the following packages:
SPROcc
Current status: installed

What other packages are installed?

% pkginfo | grep SPRO
application SPROatd                          Sun Studio 12 update 1 Advanced Tools Development Module
application SPROcc                           Sun Studio 12 update 1 C Compiler
application SPROcmpl                         Sun Studio 12 update 1 C++ Complex Library
application SPROcpl                          Sun Studio 12 update 1 C++ Compiler
application SPROcplx                         Sun Studio 12 update 1 C++ 64-bit Libraries
...

Sun Studio 12 update 1 installed on OpenSolaris

What version is built into the binary?

% /opt/sunstudio12.1/bin/cc -V
cc: Sun C 5.10 SunOS_i386 2009/06/03
usage: cc [ options] files.  Use 'cc -flags' for details

Which package is that binary in?

% pkg search -lp /opt/sunstudio12.1/bin/cc
PACKAGE                                   PUBLISHER
pkg:/developer/sunstudio12u1@12.1.1-0.111

What other packages are installed?

% pkg list | grep -i studio
developer/sunstudio12u1                       12.1.1-0.111    installed  -----

Sun Studio 12 update 1 installed on SuSE 11 Linux

What version is built into the binary?

% /opt/sun/sunstudio12.1/bin/cc -V
cc: Sun C 5.10 Linux_i386 2009/06/03
usage: cc [ options] files.  Use 'cc -flags' for details

Which package is that binary in?

% rpm -qf /opt/sun/sunstudio12.1/bin/cc
sun-cc-12.1-1

What other packages are installed?

% rpm -qa | grep sun- | head
sun-lang-12.1-1
sun-idext-12.1-1
sun-mr3m-12.1-1
sun-prfan-12.1-1
sun-stl4h-12.1-1
sun-cplx-12.1-1
sun-dbxx-12.1-1
sun-pls-12.1-1
sun-dwrfs-12.1-1
sun-rtmx-12.1-1
...

Notes

The excessively terse naming convention comes from the ancient restrictions in AT&T System V UNIX that limited package names to 9 characters.  Sun also made an early decision to prefix package names with 4 letters to mark the part of the company that was releasing the packages.  In all fairness, Sun was trying to invent a scheme where outside software vendors could reasonably choose package names without accidentally conflicting with any of the Sun packages.  That’s difficult to do in only 9 characters.  On OpenSolaris, you can see that we merged everything into one package.  Because the friendly new packaging system is one of the highlights of OpenSolaris, we didn’t want to confuse new users with the multitude of small packages we have for Sun Studio.

Hopefully, this information will be useful in a variety of circumstances. Inside the Studio team, we need to go back and forth between all three packaging systems, and it’s not easy to remember the right system commands to work with the packages on a given system. In the support team, one of the first things they ask a customer is which version of the Sun Studio software they are running. It’s also possible to install subsets of Sun Studio, so you may want to know which tools are currently installed.
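Since the right command differs on each packaging system, one way to avoid memorizing them is a small lookup helper. The sketch below only builds the argument lists and does not run anything. The commands and flags are copied from the recipes in this post, while the keys "sysv", "ips" and "rpm" and the function names are made-up labels for illustration.

```python
# Command that reports which package owns a given file, per packaging system.
# Flags are the ones used in the recipes in this post.
OWNER_QUERY = {
    "sysv": ["pkgchk", "-l", "-p"],   # Solaris 10 SYSV packages
    "ips": ["pkg", "search", "-lp"],  # OpenSolaris IPS
    "rpm": ["rpm", "-qf"],            # SuSE / RedHat Linux
}

# Command that lists installed packages, per packaging system.
# The post filters this output with grep (SPRO, studio, sun-).
LIST_INSTALLED = {
    "sysv": ["pkginfo"],
    "ips": ["pkg", "list"],
    "rpm": ["rpm", "-qa"],
}


def owner_query_command(system, path):
    """Return the argument list that asks which package contains path."""
    return OWNER_QUERY[system] + [path]


def list_installed_command(system):
    """Return the argument list that lists installed packages."""
    return list(LIST_INSTALLED[system])
```

For example, owner_query_command("rpm", "/opt/sun/sunstudio12.1/bin/cc") yields the same rpm -qf invocation shown in the SuSE recipe.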

Note: Studio will actually run fine on lots of different versions of Linux, including distributions that don’t use RPM as their native package format (like Ubuntu).  The tarball downloads are useful for those Linux distributions.

Code Bubbles

Saturday, March 13th, 2010

This is the IDE for me.  They start talking about debugger functionality about 75% of the way through.  IDEs are all about navigating huge amounts of information. Code Bubbles (http://www.cs.brown.edu/people/acb/codebubbles_site.htm)

Sun Studio uninstall problems (Sun Studio 12 update 1)

Monday, December 21st, 2009

If you installed the initial release of Sun Studio 12 update 1 (around June of 2009) you might have some problems running the uninstall script that came with it.  Our installer guru came up with a “workaround” script which is now available for download on the Sun Download Center.  You can find a description of the problems and a link to the script on the Sun Studio web site’s Troubleshooting Page.  You may also find it useful to check the Sun Studio 12 update 1 installation guide.  Some of the failure modes may show you errors like this:

The local registry (/root/.nbi/registry.xml) could not be loaded, or was loaded partially.
The installer can continue to work normally, but doing so may result in a corrupted global registry.

As Sun moves towards using the IPS packaging system, we’ll be able to rely more on the packaging tools built in to Solaris, and we won’t have as many issues like these.  I’m looking forward to it.

Facets of Programming

Tuesday, November 10th, 2009

I’ve been thinking recently about the fact that the average piece of software code includes instructions to the compiler mixed together with instructions that should be executed at runtime.  Type declarations are instructions to the compiler. Most of the general sequential code is instructions that should be executed at runtime.  It occurred to me these are just two of many facets of software.  It would be nice to enable all these facets to be mixed together into one document so that the author of the software can keep all the facets consistent.

Another facet is specifications or unit tests.  I group those together because the way they’re tied to software at the code level is very similar, and they serve similar purposes.  There is an approach to coding called “Test Driven Development” (TDD) where unit tests are written simultaneously with individual chunks of code.  There is a variant of this called “Behavior Driven Development” (BDD).  I was exposed to BDD in the latest Scala book (Programming Scala), and that was when I realized that TDD is really about verifiable specification, not so much about testing.

I really don’t want to use something that’s just a “programming language”; I want to use a “Software Authoring System”.

So what are the facets that a good “Software Authoring System” needs?

Runtime instructions: The purest expression of this facet is in dynamically typed languages, because they omit static type declarations.

Compiler instructions: Static type declarations for variables are instructions to the compiler. Type definitions themselves (in static or dynamic languages) are partly for the benefit of the compiler, and partly for the specification of runtime behavior.  Explicit testing and runtime manipulation of types (metaprogramming) uses types as part of the runtime behavior of the program. Virtual dispatch uses type information to determine runtime behavior. But non-virtual dispatch is really just a hint to the compiler about what code is going to be associated with what data. The behavior of such code is wired down at compile time.  The compiler uses it to optimize, and report programming errors back to the user.
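The split between these two facets can be seen in a few lines of Python, offered here only as an illustration and not as the author’s example. Type annotations are stored on the function for external tools to read, but the interpreter does not enforce them, so they act purely as the “compiler instructions” facet while the function body is the “runtime instructions” facet. The function name double is hypothetical.

```python
def double(x: int) -> int:
    """The annotations are for tools; the body is what runs."""
    return x * 2


# The annotations are recorded on the function object for checkers to read.
print(double.__annotations__)

# The interpreter does not enforce them: a str argument is accepted
# and the body simply does what '*' means for strings.
print(double("ab"))
```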

The way that instructions are provided to the compiler should be rethought.  The declarative style of such instructions should be retained, but the functionality should be extensible through code that’s integrated with the project code. If I don’t like the way the static type system works (as supplied by the environment), I should be able to write extensions to it that will be executed by the compiler when it compiles my code. Among other benefits, this would allow me to implement better Domain Specific Languages and add better support for static analysis tools.  Moving the language complexity associated with static typing into a user-extensible library would also streamline the core language specification. The implementation of this feature would be more natural in a language where the compiler could just as easily interpret code as compile it, like dynamic interpreted languages.

Documentation: Embedding chunks of documentation inside your source code is a good start, and extracting method signatures is also useful (à la javadoc).  But a truly integrated system could provide much more information about interface specifications, preconditions, postconditions, etc.

Interface specification: If a public function takes arguments including a list and an integer that must be less than or equal to the length of the list, how does the author encode that information into the source code?  They can put it in comments.  They can add an assert statement (which will likely be ignored by the compiler, optimizer, documentation system etc). They can use TDD to create a test case that ensures the module throws an exception if the precondition is violated.  None of that goes far enough.  This kind of specification needs to be supported directly by the programming language and tied into the other facets of programming.
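To make the list-and-integer example concrete, here is a sketch of the three options as they might look in Python. The function names are hypothetical. Each version states the same precondition in a different place, and none of them is visible to the other facets, which is the gap being described.

```python
def take_commented(items, count):
    # Precondition (comment only, nothing checks it): count <= len(items)
    return items[:count]


def take_asserted(items, count):
    # Checked while developing, but assert statements are removed
    # when Python runs with the -O option.
    assert count <= len(items), "count must not exceed len(items)"
    return items[:count]


def take_checked(items, count):
    """Return the first count elements; raise ValueError if count is too large."""
    if count > len(items):
        raise ValueError("count must not exceed len(items)")
    return items[:count]


def test_take_checked_rejects_large_count():
    """A TDD-style test that pins down the precondition as behavior."""
    try:
        take_checked([1, 2, 3], 4)
    except ValueError:
        return True
    return False
```

In all three cases the precondition lives in prose, in a statement that may be compiled away, or in a separate test; the documentation tools and the compiler have no shared, declared form of it to work from.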

Module definition: The source code structures used to create a piece of software (a reusable module) are often not the same structures that you want to use to control how that software is used by other components.  That’s why programming languages support Classes and instances for object oriented design, and also support some concept of modules or packages for controlling the import and export of software interfaces to other components.  In most cases, this module/package support is a very thin layer glued onto the outside of a programming language.  For example, when creating a shared library on UNIX, there are linker-specific ways to enumerate which symbols are visible to consumers of the library.  There are also platform-specific hooks to allow this information to be passed into the linker from the source code, but again, it’s not truly integrated into the programming language.

Optimization: The author of a component usually needs to concern themselves (at some level) with basic choices that affect performance.  In some cases this requires in-depth bit-twiddling and careful selection of compiler options, but in some cases it just means choosing data structure implementations that are appropriate for the task at hand.  As an author, I don’t want to have to use the Makefile to assign different optimization levels to different source files. I’d like to declare that a particular chunk of code needs to be heavily optimized and have the compiler just do the right thing.

Binding: Two kinds of binding are important to me as an author. My component will need to bind to the implementations of the interfaces it needs to be complete. Also, other components will bind to my component. The way I facilitate external binding is really part of the “module definition” facet discussed above.  The binding facet concerns itself with how to control the way that a component binds to its own required components. In some cases you want this binding to be via copy inclusion (think of archive libraries or combining .class files into a .jar file). In some cases you want to bind to external component, and include a version dependency in your component’s binary representation. In my source code I can say “import webservices.securesocket”, should I really have to go over to the makefile or build.xml in order to specify which version of the package I need to depend on?  A lot of what currently needs to go in the build structures should be folded into a new style of “rich” source code.

Static analysis: There’s a very useful development cycle in statically typed languages where you iterate between the compiler and editor while the compiler tells you about static typing errors in your source code.  But there are other static analysis tools besides the compiler.  The “compiler instructions” programming facet that I described above should be extended to include giving instructions to multiple static analysis tools in a unified way. Alternatively, the compiler can be extended as a universal front-end to outside analysis tools. Either way, this facet of programming should be expanded.

With this understanding as a basis, the next exercise would be to define a streamlined language that could be used as the basis for this kind of modern authoring system.