The ZOOM ErrorLogger Package:


The General Idea

The Need

The program has started multiple threads (probably pthreads, though the threading library is irrelevant) that share the same address space for global variables. Each thread may on occasion issue error messages using the ZOOM ErrorLogger package.

Using the naive ErrorLog mechanism, two threads issuing error messages at overlapping times will each send instructions to the same ELadministrator, so the pieces of their messages can interleave and come out garbled. What is wanted instead is some way for each thread to compose a message independently, and for each message to emerge as one uninterrupted chunk at the destinations.

Meeting this Need

The package now provides a new class ThreadSafeErrorLog derived from ErrorLog. If each thread instantiates and uses its own instance of ThreadSafeErrorLog, then messages composed via these will not garble each other.

The strategy employed is that the ThreadSafeErrorLog has its own area for building up a message, rather than using the common one in ELadministrator. When the message is completed, the ThreadSafeErrorLog obtains a lock representing the right to send a message to the administrator, sends the message, and releases the lock.
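The strategy can be illustrated with a minimal sketch. This is not the ZOOM implementation: Administrator, LocalLog, end() and composeInterleaved() are hypothetical stand-ins, and std::mutex is assumed in place of whatever lock the job actually uses. Two logs build their messages at overlapping times, yet each message arrives whole because only the final hand-over touches shared state.

```cpp
#include <cassert>
#include <mutex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the shared administrator and its destinations.
struct Administrator {
  std::mutex sendRight;                // held only while a message is handed over
  std::vector<std::string> delivered;  // one entry per completed message
};

// Hypothetical stand-in for a log that owns its own build-up area.
class LocalLog {
public:
  explicit LocalLog(Administrator& a) : admin(a) {}

  // Pieces accumulate privately; nothing shared is touched here.
  template <class T>
  LocalLog& operator<<(const T& piece) {
    pending << piece;
    return *this;
  }

  // Completing the message: lock, hand over the whole text, unlock.
  void end() {
    std::lock_guard<std::mutex> hold(admin.sendRight);
    admin.delivered.push_back(pending.str());
    pending.str("");  // start the next message with an empty buffer
  }

private:
  Administrator& admin;
  std::ostringstream pending;
};

// Two logs compose at overlapping times, as two threads might.
std::vector<std::string> composeInterleaved() {
  Administrator admin;
  LocalLog a(admin);
  LocalLog b(admin);
  a << "track fit failed, ";
  b << "calorimeter timeout, ";
  a << "chi2 = " << 42;
  b << "cell = " << 7;
  a.end();
  b.end();
  return admin.delivered;
}
```

Calling composeInterleaved() yields two entries, one per log, with no fragment of one message inside the other. The lock is held only for the hand-over, so composing a long message never blocks other threads.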

An example of how this can be used is supplied in MultThread.h and MultThread.cc. Here, care is taken to set things up so that each thread has its own ThreadSafeErrorLog, known as errlog. Error messages are then issued in exactly the same way as they would have been through the normal errlog, and the behavior is also the same except that there will be no garbling of the output.

The Mutex (Locking) Mechanism

Imagine a concept of the right to issue a message to the administrator. We want to be "exception safe", which in this situation means guaranteeing that, whatever happens while that right is held, it is always released before control leaves. This is achieved by the idiom of "resource acquisition is object construction"; that is, a Mutex class is used to represent the right to issue a message. When about to use the administrator, the ThreadSafeErrorLog opens a code block and instantiates a Mutex; at the end of the block the destructor is automatically called, which releases the lock.
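The exception-safety claim can be checked with a small sketch. The names SendRight, sendAndFail() and lockIsFreeAfterFailure() are hypothetical, and the failure is simulated with a throw; the only real API used is the POSIX mutex interface. The point is that the destructor runs during stack unwinding, so the lock is free again even though the send never finished.

```cpp
#include <cassert>
#include <pthread.h>
#include <stdexcept>

// Hypothetical guard: holding one of these represents the right to send.
class SendRight {
public:
  explicit SendRight(pthread_mutex_t* m) : mutex(m) { pthread_mutex_lock(mutex); }
  ~SendRight() { pthread_mutex_unlock(mutex); }
  SendRight(const SendRight&) = delete;
  SendRight& operator=(const SendRight&) = delete;

private:
  pthread_mutex_t* mutex;
};

// Acquires the right, then fails part-way through (simulated with a throw).
void sendAndFail(pthread_mutex_t* m) {
  SendRight right(m);
  throw std::runtime_error("simulated failure while sending");
}

// Returns true if the lock is free again after the failed send.
bool lockIsFreeAfterFailure() {
  pthread_mutex_t m;
  pthread_mutex_init(&m, nullptr);
  try {
    sendAndFail(&m);
  } catch (const std::runtime_error&) {
    // The guard's destructor already ran while the stack unwound.
  }
  // pthread_mutex_trylock returns 0 only if it acquired the mutex.
  bool wasFree = (pthread_mutex_trylock(&m) == 0);
  if (wasFree) pthread_mutex_unlock(&m);
  pthread_mutex_destroy(&m);
  return wasFree;
}
```

Had sendAndFail() called pthread_mutex_lock and pthread_mutex_unlock by hand, the throw would have skipped the unlock and the next sender would have waited forever.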

Now the Exceptions package has no business dictating the actual locking mechanism to the user. Instead, ThreadSafeErrorLog is templated off an arbitrary user-defined class which we name Mutex.

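The Mutex class must have two properties:
  1. Its constructor acquires the lock, waiting until the lock is available.
  2. Its destructor releases the lock.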

A workable Mutex could be as simple as this (taken from the test program MultThread.cc):
#include <pthread.h>

pthread_mutex_t* mutex;   // a POSIX mutex, represented as a global-scope
                          // pointer, initialized and set up at some
                          // early point in the job.
struct Mutex {
  Mutex() { pthread_mutex_lock(mutex);   }  // acquire on construction
 ~Mutex() { pthread_mutex_unlock(mutex); }  // release on destruction
};
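To show what being templated off a user-defined Mutex means in practice, here is a minimal sketch under stated assumptions. SketchLog and CountingMutex are hypothetical names, not part of the package, and CountingMutex only counts constructions and destructions instead of locking anything. It stands in for a real Mutex so that the acquire and release points become visible.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical Mutex for illustration: it counts, it does not lock.
struct CountingMutex {
  static int acquired;
  static int released;
  CountingMutex()  { ++acquired; }
  ~CountingMutex() { ++released; }
};
int CountingMutex::acquired = 0;
int CountingMutex::released = 0;

// Hypothetical log templated on the Mutex type, in the spirit of the text.
template <class Mutex>
class SketchLog {
public:
  explicit SketchLog(std::vector<std::string>& dest) : destination(dest) {}

  void send(const std::string& message) {
    Mutex right;                     // constructor acquires the right to send
    destination.push_back(message);  // the protected hand-over
  }                                  // destructor releases it here

private:
  std::vector<std::string>& destination;
};
```

After two calls to send() on a SketchLog<CountingMutex>, both counters read 2: one acquisition and one release per message. Swapping in a Mutex built on pthread_mutex_lock changes the locking behavior without touching the log class.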

Setting Up a ThreadSafeErrorLog for Each Thread

This is the non-trivial point in using the ThreadSafeErrorLog. Typically, errlog is established in one of two ways:
  1. errlog might be a global variable of type ErrorLog
  2. errlog might be a data member of some Module class, and the various physics routines might be methods in that class (or a subclass of Module) -- those methods can use errlog as if it were a global
The former method is unsuitable for using ThreadSafeErrorLog, since the one global variable would be shared (in a conflicting manner) by all the threads, leading to the same garbling as observed with ErrorLog.

The latter method, having errlog as a public or protected ThreadSafeErrorLog data member in a Module class, works fine. Each thread, upon startup, instantiates whatever Modules it requires, and now each Module in each thread has its own ThreadSafeErrorLog, which can accumulate its own message independent of the activities of other threads.

For example, code in the MultThread example does something like:

class Module  {
public:
  Module( const std::string & name );
  virtual void operator()( Event & e ) = 0; // do this module's processing!
  virtual ~Module();
  ThreadSafeErrorLog<Mutex> errlog;         // one log per Module instance
};  // Module

class DoPhysics : public Module {
public:
  DoPhysics( const std::string & name, int tnum ) : Module(name){}
  void operator()( Event & event ) {
    // ...
    errlog( ELsuccess, "An Event" ) << "data = " << event.data << endmsg;
    // ...
  }
};  // DoPhysics
And each thread has its own Module because the thread program starts with:
  DoPhysics  doPhysics( name, tnum );
and causes the physics methods to be invoked via
  for ( int n = 1;  n < 20;  ++n )  {
    Event event( n );
    doPhysics( event );
  }
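The per-thread setup can be tried out without the ZOOM headers using the following sketch. Everything in it is a hypothetical stand-in: PerThreadLog plays the role of ThreadSafeErrorLog, gDelivered plays the role of the destinations, an int replaces Event, and std::thread is assumed instead of raw pthreads. What it shows is that the Module is a local object of the thread's entry function, so its log belongs to that thread alone.

```cpp
#include <cassert>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical shared destination and the lock guarding it.
std::mutex gSendLock;
std::vector<std::string> gDelivered;

// Hypothetical stand-in for ThreadSafeErrorLog: private buffer, locked hand-over.
struct PerThreadLog {
  std::ostringstream pending;
  void flush() {
    std::lock_guard<std::mutex> hold(gSendLock);
    gDelivered.push_back(pending.str());
    pending.str("");
  }
};

struct Module {
  explicit Module(const std::string& n) : name(n) {}
  virtual ~Module() {}
  virtual void operator()(int eventData) = 0;
  std::string name;
  PerThreadLog errlog;  // data member, so one log per Module instance
};

struct DoPhysics : Module {
  using Module::Module;
  void operator()(int eventData) override {
    errlog.pending << name << ": data = " << eventData;
    errlog.flush();
  }
};

// Thread entry point: the Module is local, so it belongs to this thread only.
void threadMain(int tnum, int nEvents) {
  DoPhysics doPhysics("physics-" + std::to_string(tnum));
  for (int n = 1; n <= nEvents; ++n) doPhysics(n);
}

// Starts nThreads threads and returns every message that was delivered.
std::vector<std::string> runJob(int nThreads, int nEvents) {
  gDelivered.clear();
  std::vector<std::thread> threads;
  for (int t = 0; t < nThreads; ++t) threads.emplace_back(threadMain, t, nEvents);
  for (std::thread& th : threads) th.join();
  return gDelivered;
}
```

The order of the delivered messages varies from run to run, but each one is complete. Making errlog a single global instead would have all threads appending to the same pending buffer, which is the garbling described for the first method.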

Mark Fischler
Last modified: August 6, 1999