Posts Tagged ‘Parallelism’

Native Posix Python Condition Implementation

So I wrote this replacement version of Condition using the native POSIX support. Event and Semaphore are both written in terms of Condition, so you can use this as a fast route to getting native versions of those synchronization primitives. (Note, though, that there is a native POSIX semaphore, so implementing it in terms of a condition variable is not really necessary.)

I have trained myself to ask, "why hasn't anybody done this before?" when writing this sort of thing. And as always there's a really good reason: for a percentage of applications that is probably very close to one hundred percent, the difference in performance between this and the pure Python version (whose timed waits are implemented using polling) is not going to amount to a hill of beans. That was indeed the case for the application where I was trying to put this to use, and I reverted to the much simpler Python Condition.

But there was no way to know for sure that that was the case without trying this, so if you find yourself in a similar situation, here is the code.

#include <Python.h>
#include <structmember.h>
#include <errno.h>
#include <pthread.h>

typedef struct {
  PyObject_HEAD
  int set;
  pthread_cond_t cond;
  pthread_mutex_t lock;
} ConditionObject;

static int cond_init( ConditionObject* self, PyObject* args,
              PyObject* kwargs );
static void cond_free( ConditionObject* self );
static PyObject* cond_acquire( ConditionObject* self );
static PyObject* cond_release( ConditionObject* self );
static PyObject* cond_wait( ConditionObject* self, PyObject* args );
static PyObject* cond_notify( ConditionObject* self );
static PyObject* cond_notifyAll( ConditionObject* self );

static PyMemberDef cond_members[] = {
  {NULL}
};

static PyMethodDef cond_methods[] = {
  { "acquire", (PyCFunction)cond_acquire, METH_NOARGS, "" },
  { "release", (PyCFunction)cond_release, METH_NOARGS, "" },
  { "wait", (PyCFunction)cond_wait, METH_VARARGS, "" },
  { "notify", (PyCFunction)cond_notify, METH_NOARGS, "" },
  { "notifyAll", (PyCFunction)cond_notifyAll, METH_NOARGS, "" },
  { NULL }
};

static PyTypeObject ConditionType = {
  PyObject_HEAD_INIT(NULL)
  0,                         /*ob_size*/
  "_pthread_cond.Condition", /*tp_name*/
  sizeof(ConditionObject),   /*tp_basicsize*/
  0,                         /*tp_itemsize*/
  (destructor)cond_free,     /*tp_dealloc*/
  0,                         /*tp_print*/
  0,                         /*tp_getattr*/
  0,                         /*tp_setattr*/
  0,                         /*tp_compare*/
  0,                         /*tp_repr*/
  0,                         /*tp_as_number*/
  0,                         /*tp_as_sequence*/
  0,                         /*tp_as_mapping*/
  0,                         /*tp_hash */
  0,                         /*tp_call*/
  0,                         /*tp_str*/
  0,                         /*tp_getattro*/
  0,                         /*tp_setattro*/
  0,                         /*tp_as_buffer*/
  Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,        /*tp_flags*/
  "",                        /* tp_doc */
  0,                       /* tp_traverse */
  0,                       /* tp_clear */
  0,                       /* tp_richcompare */
  0,                       /* tp_weaklistoffset */
  0,                       /* tp_iter */
  0,                       /* tp_iternext */
  cond_methods,              /* tp_methods */
  cond_members,              /* tp_members */
  0,                         /* tp_getset */
  0,                         /* tp_base */
  0,                         /* tp_dict */
  0,                         /* tp_descr_get */
  0,                         /* tp_descr_set */
  0,                         /* tp_dictoffset */
  (initproc)cond_init,       /* tp_init */
  0,                         /* tp_alloc */
  0,                         /* tp_new */
};

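static int cond_init( ConditionObject* self, PyObject* args,
              PyObject* kwargs ) {
  self->set = 0;
  // pthread functions return the error code rather than setting errno,
  // so copy it into errno before asking Python to build the OSError.
  int err = pthread_mutex_init( &self->lock, NULL );
  if( err != 0 ) {
    errno = err;
    PyErr_SetFromErrno( PyExc_OSError );
    return -1;
  }
  err = pthread_cond_init( &self->cond, NULL );
  if( err != 0 ) {
    // Don't leave a half-initialized object holding a live mutex.
    pthread_mutex_destroy( &self->lock );
    errno = err;
    PyErr_SetFromErrno( PyExc_OSError );
    return -1;
  }
  return 0;
}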

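static void cond_free( ConditionObject* self ) {
  pthread_cond_destroy( &self->cond );
  pthread_mutex_destroy( &self->lock );
  // Release the object's own memory; without this every Condition leaks.
  self->ob_type->tp_free( (PyObject*)self );
}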

static PyObject* cond_acquire( ConditionObject* self ) {
  int err = 0;
  Py_BEGIN_ALLOW_THREADS;
  err = pthread_mutex_lock( &self->lock );
  Py_END_ALLOW_THREADS;
  if( err != 0 ) {
    errno = err;  // pthread returns the code; it does not set errno
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  Py_RETURN_NONE;
}

static PyObject* cond_release( ConditionObject* self ) {
  int err = pthread_mutex_unlock( &self->lock );
  if( err != 0 ) {
    errno = err;  // pthread returns the code; it does not set errno
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  Py_RETURN_NONE;
}

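static PyObject* cond_wait( ConditionObject* self, PyObject* args ) {
  // Timed waits are not supported yet; any timeout argument is ignored.
  // The caller must hold the lock. As with any condition variable, a
  // wakeup can be spurious, so callers should re-check their own
  // predicate in a loop. (Looping on self->set here can miss wakeups:
  // a notifier that holds the lock clears the flag again before this
  // thread is able to reacquire the mutex and look at it.)
  int err = 0;
  Py_BEGIN_ALLOW_THREADS;
  err = pthread_cond_wait( &self->cond, &self->lock );
  Py_END_ALLOW_THREADS;
  if( err != 0 ) {
    errno = err;  // pthread returns the code; it does not set errno
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  if( PyErr_CheckSignals() ) {
    return NULL;
  }
  Py_RETURN_NONE;
}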

static PyObject* cond_notify( ConditionObject* self ) {
  self->set = 1;
  int err = 0;
  Py_BEGIN_ALLOW_THREADS;
  err = pthread_cond_signal( &self->cond );
  Py_END_ALLOW_THREADS;
  self->set = 0;
  if( err != 0 ) {
    errno = err;  // pthread returns the code; it does not set errno
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  Py_RETURN_NONE;
}

static PyObject* cond_notifyAll( ConditionObject* self ) {
  self->set = 1;
  int err = 0;
  Py_BEGIN_ALLOW_THREADS;
  err = pthread_cond_broadcast( &self->cond );
  Py_END_ALLOW_THREADS;
  self->set = 0;
  if( err != 0 ) {
    errno = err;  // pthread returns the code; it does not set errno
    PyErr_SetFromErrno( PyExc_OSError );
    return NULL;
  }
  Py_RETURN_NONE;
}

static PyMethodDef module_methods[] = {
  {NULL}
};

PyMODINIT_FUNC
init_pthread_cond(void)
{
  ConditionType.tp_new = PyType_GenericNew;
  if( PyType_Ready( &ConditionType ) < 0 ) {
    return;
  }
  PyObject* mod = Py_InitModule( "_pthread_cond", module_methods );
  if( mod == NULL ) {
    return;
  }
  Py_INCREF( &ConditionType );
  PyModule_AddObject( mod, "Condition", (PyObject*)&ConditionType );
}
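
As a rough illustration of the earlier point that Event is written in terms of Condition, here is a minimal Python sketch (not the standard library's actual code). The ConditionEvent name and the condition_factory parameter are made up for this example. The factory defaults to threading.Condition; a native class could be passed instead, assuming it exposes acquire, release, wait and notify_all (the C type above spells the last one notifyAll, so it would need an alias).

```python
import threading


class ConditionEvent:
    """A minimal Event built on top of a Condition."""

    def __init__(self, condition_factory=threading.Condition):
        # condition_factory is a hypothetical hook for swapping in another
        # Condition implementation with the same method names.
        self._cond = condition_factory()
        self._flag = False

    def is_set(self):
        return self._flag

    def set(self):
        self._cond.acquire()
        try:
            self._flag = True
            self._cond.notify_all()  # wake every waiter
        finally:
            self._cond.release()

    def clear(self):
        self._cond.acquire()
        try:
            self._flag = False
        finally:
            self._cond.release()

    def wait(self):
        self._cond.acquire()
        try:
            # Re-check the flag in a loop: a wakeup alone does not
            # guarantee that the flag is set.
            while not self._flag:
                self._cond.wait()
            return True
        finally:
            self._cond.release()
```

The state that matters (the flag) lives in the Event, not in the Condition. The Condition only provides the lock and the wakeup.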

First Impressions of CCR

I got a chance to mess around with the Concurrency and Coordination Runtime (CCR) bits recently. Before I get into that, first check out this real-life code I wrote this week.

public static class NotificationQueue
{
    private static Queue<Notification> _queue;
    private static Semaphore _work;

    static NotificationQueue()
    {
        _queue = new Queue<Notification>();
        _work = new Semaphore(0, int.MaxValue);
    }

    public static void Enqueue(Notification n)
    {
        lock (_queue)
        {
            _queue.Enqueue(n);
        }
        _work.Release();
    }

    public static Notification Dequeue()
    {
        _work.WaitOne();
        lock (_queue)
        {
            return _queue.Dequeue();
        }
    }
}

The idea here is to post notifications to the queue from many threads, and have a single notification thread sending the messages out from something like this:

private static void NotifyThreadProc()
{
    while(!Abort())
        NotificationQueue.Dequeue().Send();
}

The documentation for CCR is pretty sparse at this point, limited to just this paper [pdf], the Channel9 video, and this Wiki. Some of the particulars seem to have changed significantly, but I was able to figure out how to replicate my explicitly-threaded notification queue:

public class NotificationService : CcrServiceBase
{
    private Port<Notification> _port;

    public NotificationService()
        : base(new DispatcherQueue("foo"))
    {
        _port = new Port<Notification>();
        Activate(Arbiter.Receive(true, _port,
            delegate(Notification n)
            {
                n.Send();
            }));
    }

    public void Post(Notification n)
    {
        _port.Post(n);
    }
}

That's pretty awesome: notice that no explicit locks or waits are necessary, and I don't need to write the NotifyThreadProc or the code required to start it. I just need to make a NotificationService and start posting to it.

I can't say it's the most immediately comprehensible API I've ever seen, but hopefully that will change with more documentation and some polish. It's also possible that I am just warped from years of using the lower-level concepts. Overall this is awfully impressive for a library.
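
For anyone who has not seen CCR, the shape of that service (post to a port, and a persistent receiver handles each item) can be approximated in a few lines of Python. This is only a sketch of the idea, not of how CCR is implemented: the Port class, its drain method, and the single daemon worker thread are all assumptions made for the example.

```python
import queue
import threading


class Port:
    """Hypothetical stand-in for a port with one persistent receiver."""

    def __init__(self, handler):
        self._items = queue.Queue()  # thread-safe; no explicit locks needed
        self._handler = handler
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def post(self, item):
        # Callable from any thread; never waits for the handler.
        self._items.put(item)

    def _run(self):
        # The receiver stays active for the life of the port.
        while True:
            item = self._items.get()
            try:
                self._handler(item)
            finally:
                self._items.task_done()

    def drain(self):
        # Block until every posted item has been handled.
        self._items.join()
```

The caller only ever sees post, which is the same property the CCR version has: the waiting and the thread management are hidden behind the port.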

Some Twists on Blocked Finalizers

Blocked finalizer threads have gotten some recent publicity on Tess Ferrandez's blog. I recently ran into this myself, although the particulars were slightly different.

Our problem was manifesting itself in a very long-running console application that both connected to a database and made web service calls. Eventually the app would start hitting OutOfMemoryExceptions. (Completely unrelated to these OutOfMemoryExceptions. We have a high incidence of OutOfMemoryExceptions around here.)

Fixing the Wrong Problem and Moving the Goalposts

My initial reaction to this problem was (sensibly, I think) to attempt to reduce the amount of memory that the application was using. I admit that I was also focused on different problems at the time, so any makeshift solution I could get to work here would have been just fine with me.

We threw in some buffer pools, et cetera, and managed to get the app running slightly longer, but the OOMs always resurfaced. Every time they did, the heap looked completely different due to the semi-drastic changes we were making to the app's memory profile. After the third or maybe fourth change it dawned on me that perhaps we were just confusing the real issue.

The Proverbial Lightbulb Over the Head

The turning point came when I noticed that most of the overly-abundant objects on the heap in a particular dump were types that were likely to be associated with finalizers.

0:000> !dumpheap -stat
------------------------------
...
2,431,972    48,639,440 System.Threading.WaitHandle/__WaitHandleHandleProtector
...

This hadn't been the case in the previous dumps. Those were eaten alive by byte arrays and strings. By this point, we had done such a fantastic job of optimizing the application for memory usage that the problem came more clearly into focus.

The Road to Victory

At this point I used !threads in SOS to locate the finalizer thread (some useless info trimmed out here):

0:000> !threads
ID APT   Exception
0  STA System.OutOfMemoryException
4  MTA (Finalizer)

Thread 4 was stuck in a WaitForSingleObject and, not unrelated to the problem here, the native stacks of almost all of the other threads in the process were in some variation of "WaitUntilGCComplete." Probably should have noticed that sooner. So sue me.

Anyway, the top of the finalizer's stack was interesting:

0:004> ~4k
ChildEBP RetAddr
00f0f658 7c822124 ntdll!KiFastSystemCallRet
00f0f65c 77e6baa8 ntdll!NtWaitForSingleObject+0xc
00f0f6cc 77e6ba12 kernel32!WaitForSingleObjectEx+0xac
00f0f6e0 776c54ef kernel32!WaitForSingleObject+0x12
00f0f6fc 77789905 ole32!GetToSTA+0x6f
00f0f71c 77787ed7 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xcb

The finalizer is waiting to make a call over to an STA thread. Why's it doing that? And wait, what STA thread? Funny you should ask.

Visual Basic Complications

No technical problem at my post-VBScript company would be complete without Visual Basic putting a unique spin on things. This one is no exception. The authors of this app are fairly oblivious to COM apartments and didn't intentionally create any STA threads. They also weren't using COM Interop directly.

So how did thread zero end up being an STA thread (the astute among you may have noticed this in the !threads output above)? The MSDN documentation seems to clearly say that MTA is the default.

The reason was that the app was written in Visual Basic. If the Main() method in a VB program is lacking the STAThreadAttribute and the MTAThreadAttribute, the VB compiler will stick an STAThreadAttribute on it for you when the MSIL is emitted (the C# compiler doesn't do this). Presumably–and I'm guessing here but I think this is a good guess–this is done for compatibility with VB6.

The main thread of this application was sleeping the vast majority of the time. Admittedly, this is not a fantastic example of multithreaded design, but it wouldn't have caused any problems if the thread were MTA. As Raymond has pointed out, sleeping on an STA thread can have some ill effects.

The Fix

This problem was solved by decorating the main method with <MTAThread()>. This is possibly the smallest fix to an apparently massive problem I have ever personally witnessed.

How to Notice This Problem Much Sooner

There are a few ways this could have been less painful. The easiest might have been to check the output of SOS's !finalizequeue command. If I had done that I would have seen the following (again trimmed for clarity):

Heap 0
generation 0 has 0 finalizable objects
generation 1 has 0 finalizable objects
generation 2 has 11,212 finalizable objects
Ready for finalization 1,163,295 objects

Which does not look healthy. Looking more closely at the native stacks would have also probably yielded a solution on day one.

Generic and Threadsafe Singleton Implementation

I googled around and couldn't find a generic singleton implementation that 1) was correct and 2) met all of my needs. This is a clever approach, but unfortunately it is limited to objects that are created by calls to constructors.

So I went ahead and wrote the singleton generic that I fully expect to be included in the next BCL. (Not really. It's completely straightforward. However, the Steelers won the Super Bowl yesterday, so I'm not really in a modest mood.)

I started out by defining a class factory type, which is responsible for creating an instance in a purposefully vague way. I also wrote the default version, which just calls new().

ClassFactory object model

/// <summary>
/// Interface for objects that create instances of
/// another type.
/// </summary>
public interface IClassFactory<T> where T : class
{
        T CreateInstance();
}

/// <summary>
/// A default <see cref="IClassFactory{T}"/> implementation,
/// which uses a parameterless constructor to create the
/// instance.
/// </summary>
public class DefaultClassFactory<T> : IClassFactory<T>
        where T : class, new()
{
        public T CreateInstance()
        {
                return new T();
        }
}

From there, I went for the slam dunk in writing both a singleton class with a class factory and a default version that doesn't require one.

Singleton object model

/// <summary>
/// A base (or helper) singleton class. Defines the
/// singleton instance.
/// </summary>
/// <typeparam name="T">
/// The type of the singleton object.
/// </typeparam>
/// <typeparam name="class_factory">
/// The type of the class factory to use to create an
/// instance of type <typeparamref name="T"/>.
/// </typeparam>
public class Singleton<T, class_factory>
    where T : class
    where class_factory : IClassFactory<T>, new()
{
    private static object _sync = new object();
    private static T _default;

    /// <summary>
    /// Gets the singleton instance.
    /// </summary>
    public static T Default
    {
        get
        {
            EnsureDefault();
            return _default;
        }
    }

    /// <summary>
    /// Ensures that the singleton has been created.
    /// </summary>
    private static void EnsureDefault()
    {
        if (_default == null)
        {
            lock (_sync)
            {
                if (_default == null)
                {
                    CreateDefault();
                }
            }
        }
    }

    /// <summary>
    /// Uses the class factory to create the instance.
    /// </summary>
    private static void CreateDefault()
    {
        class_factory cf = new class_factory();
        T value = cf.CreateInstance();

        // This ensures that writes in the creation of
        // the default instance won't be shuffled beyond
        // the write to _default. Only matters on multiproc
        // machines where the hardware allows this. Does
        // nothing on an x86.
        Thread.MemoryBarrier();
        _default = value;
    }
}

/// <summary>
/// A basic singleton type that can be used with objects
/// that are created with a parameterless constructor.
/// </summary>
public class Singleton<T> :
    Singleton<T, DefaultClassFactory<T>>
    where T : class, new()
{
}

(For more on why I added that call to Thread.MemoryBarrier(), see this writeup.)

Here's a really simple example that does not take advantage of the customizability:

public class Foo : Singleton<Foo>
{
}

If you can't use Singleton as a base class, you have to write a little more code.

public class Foo : Bar
{
    public static Foo Default
    {
        get { return Singleton<Foo>.Default; }
    }
}
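
The double-checked locking in EnsureDefault, combined with the class factory idea, can be sketched in Python as well. The LazySingleton name and its factory argument are invented for this example. The explicit Thread.MemoryBarrier() call has no direct Python counterpart and is simply left out here, so treat this as an illustration of the locking pattern rather than a port of the class above.

```python
import threading


class LazySingleton:
    """Creates one instance on first use, via a caller-supplied factory."""

    def __init__(self, factory):
        self._factory = factory        # plays the role of IClassFactory<T>
        self._sync = threading.Lock()
        self._default = None

    @property
    def default(self):
        # First check without the lock: the common case once created.
        if self._default is None:
            with self._sync:
                # Second check under the lock: another thread may have
                # created the instance while this one was waiting.
                if self._default is None:
                    self._default = self._factory()
        return self._default
```

Because the factory is just a callable, this covers both cases from the C# version: pass a class to get the "parameterless constructor" behaviour, or pass any other function when construction is more involved.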

Concurrency Approaches Contrasted

Here are a few roughly-equivalent class declarations using different languages and libraries. I say "roughly equivalent" because the implementation details and performance characteristics may actually be quite different in each case. However, the end result of each is that a function on a member variable is called in a non-blocking fashion, synchronized (presumably against other operations on the same class) with the use of some resource A.


// Vanilla C# with locks
public class FooBar
{
    private object A = new object();
    private T _member = new T();
    private delegate void FooBarDel();

    public void Foo()
    {
        FooBarDel del = new FooBarDel(DoFooBar);
        del.BeginInvoke(
          new AsyncCallback(FooBarDone), del);
    }

    // Named DoFooBar because a C# member cannot share
    // the name of its enclosing type.
    private void DoFooBar()
    {
        lock(A)
        {
            _member.DoSomething();
        }
    }

    // You can typically get away without this, but
    // the documentation recommends otherwise.
    private void FooBarDone(IAsyncResult r)
    {
        FooBarDel d = (FooBarDel)r.AsyncState;
        d.EndInvoke(r);
    }
}

// Comega using a chord join pattern
public class FooBar
{
    private T _member = new T();
    private async A();
    public async Foo() & A()
    {
        _member.DoSomething();
        A();
    }

    public FooBar()
    {
        A();
    }
}

// C# 2.0+ using CCR
public class FooBar
{
    private T _member = new T();
    private Port<int> _p = new Port<int>();

    public void Foo()
    {
        _p.Post(1);
    }

    public FooBar()
    {
        activate(!_p.with(delegate(int i)
        {
            _member.DoSomething();
        }));
    }
}

Looking at the other two examples I think it's clear that C# as it stands today is lacking. That's not exactly a groundbreaking statement, I realize, but I haven't seen many examples putting all of these together in one spot.

Comega is very elegant in a situation as simple as this. CCR also seems less mistake-prone than the plain C# version, but the example I picked isn't where CCR really shines.

(One thing I should say is that I'm not sure that the CCR example is correct or even compiles, since the library isn't public yet. I worked this out from reading the whitepaper [pdf]. It's only a few pages, so I recommend giving it a read.)

The goal of CCR has been to allow complex constructs such as this (also from the whitepaper) to be made easily, and dynamically if necessary:


    // The finish operation or the update operation
    // execute with exclusive control, whereas the
    // GetState and QueryState operations execute
    // concurrently. Operations of either type are
    // interleaved safely (using the '^' operator).
    activate(exclusive(p.with(DoneHandler),
        !p.with(UpdateState))
        ^
        concurrent(!p.with(GetState),
        !p.with(QueryState))
    );

Comega, on the other hand, provides static language features. The good news is that since CCR is just a library, it would be usable from a future C# (4.0, one would hope) that incorporated some of the ideas in Comega at the language level for the easier cases.

I had thought it was a shame that none of the Comega concurrency constructs had made it into the 3.0 C# specification. However, with CCR I think I can see why this is the case. CCR, or something like it, should offer a lot of power and flexibility well within the 3.0 timeframe without taking the risk of baking the 'wrong' keywords into C# itself.

Wrote a Concurrency Library for Everyone

Herb Sutter says the concurrency revolution is coming. This guy says, “*** you!! the concurrency revolution is coming!”

That is the kind of attitude we need in the industry. We need more in-your-face Greeks and fewer genial mustachioed C++ architects. Call me crazy.

I know this is the type of enthusiasm that is typically found in insane people building anti-gravity machines in their backyards, but I think these guys might be on to something with their Concurrency and Coordination Runtime (CCR).

I highly recommend watching the video–even if you have no idea what these people are talking about. Seriously mind-blowing stuff.

Nonblocking Pool Class

This is not an original idea but I thought I would post/explain it anyway. This is a generalized version of a pattern I have been using for a while. I'm not sure where I first picked it up but I've seen it used in several places.

The purpose of this class is to pool instances of a particular type in a server application. The assumptions I am making about the problem are:

  • It is both possible and worthwhile to reuse instances of a certain type. Types that may fit this criterion include large arrays of primitive types and types that hold unmanaged or scarce resources such as connections. Not all types fit, obviously.
  • It is more undesirable to have a thread enter a waiting state (fail to acquire a lock, in other words) than it is to create a new instance of the type being reused. That would be the case if the instances are somewhat cheap but the average request or call time to your server is relatively long.

The nice thing about this pool class is that it handles the second case gracefully. It will reuse objects as much as possible, but it won't block a thread when the attempt to get at the pool fails. A locking pool that did block could end up introducing massive contention in your attempt to increase throughput.

The class provides very lightweight synchronization using atomic operations - there's no use of critical sections (the lock keyword).

  /// <summary>
  /// Provides and reuses objects of type <typeparamref name="T"/>.
  /// </summary>
  /// <typeparam name="T">
  /// The type that is pooled. Must provide a default constructor.
  /// </typeparam>
  public class NonBlockingPool<T>
     where T : new()
  {
     // Contains the pooled items.
     private Stack<T> _stack;

     // The maximum size of _stack.
     private int _max;

     // This reference is used to ensure that only one thread
     // calls methods on _stack at a time.
     private object _lock = new object();

     /// <summary>
     /// Gets or sets the maximum size of the pool.
     /// </summary>
     public int MaximumSize
     {
        get { return _max; }
        set { _max = value; }
     }

     /// <summary>
     /// Gets a pooled instance of type <typeparamref name="T"/>,
     /// or yields a new instance.
     /// </summary>
     public T Get()
     {
        // If two threads enter this method at the same time,
        // only one will acquire _lock (the other will be given
        // null). The caller that fails to acquire _lock will
        // be returned a new instance of T.
        T ret = default(T);
        object obj = Interlocked.Exchange(ref _lock, null);
        try
        {
           if (obj != null && _stack.Count > 0)
           {
              ret = _stack.Pop();
           }
           else
           {
              ret = new T();
           }
        }
        finally
        {
           if (obj != null)
           {
              _lock = obj;
           }
        }
        return ret;
     }

     /// <summary>
     /// Reuses an instance of type <typeparamref name="T"/> in a
     /// subsequent request or call whenever possible.
     /// </summary>
     public void Reuse(T t)
     {
        // If two threads enter this method at the same time,
        // only one will acquire _lock (the other will be given
        // a null reference). The instance of T provided by
        // the losing thread will just be collected and not
        // reused.
        object obj = Interlocked.Exchange(ref _lock, null);
        try
        {
           if (obj != null && _stack.Count < _max)
           {
              _stack.Push(t);
           }
        }
        finally
        {
           if (obj != null)
           {
              _lock = obj;
           }
        }
     }

     /// <summary>
     /// Constructor.
     /// </summary>
     /// <param name="max">
     /// The maximum number of instances of
     /// <typeparamref name="T"/> to hold in the pool.
     /// </param>
     public NonBlockingPool(int max)
     {
        if (max < 0)
        {
           throw new ArgumentOutOfRangeException("max");
        }
        _stack = new Stack<T>(max);
        _max = max;
     }
  }

Here's a (contrived) minimal example of a consumer of such a pool. This server class makes a context object available to each thread for the duration of each request. This object is stored in a slot unique to each thread (specified with the ThreadStaticAttribute) while a ProcessRequest function is called. The instance is returned to the pool in a finally block after that call is finished.

  public class SampleServer
  {
     [ThreadStatic]
     private static ServerContext _context;

     private NonBlockingPool<ServerContext> _pool;

     public static ServerContext Context
     {
        get { return _context; }
     }

     internal void ProcessRequest(IServerApp app)
     {
        try
        {
           _context = _pool.Get();
           app.ProcessRequest();
        }
        finally
        {
           if (_context != null)
           {
              _context.Reset();
              _pool.Reuse(_context);

              // We want the context to be collected if it isn't
              // actually reused by the pool.
              _context = null;
           }
        }
     }
  }

A more concrete example might be an IHttpModule or a remoting server channel sink. As I said once already, it's important to consider 1) the type of resource you are pooling and 2) the amount of load your application is expecting before committing yourself to a pattern such as this one.
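
The try-lock idea behind Get and Reuse carries over to other languages. Here is a rough Python sketch, where a non-blocking Lock.acquire(blocking=False) stands in for the Interlocked.Exchange trick. The class shape, the factory argument and the max_size name are assumptions for this example, not a translation of the C# class.

```python
import threading


class NonBlockingPool:
    """Reuses instances when it can, and never makes a caller wait."""

    def __init__(self, factory, max_size):
        if max_size < 0:
            raise ValueError("max_size must be non-negative")
        self._factory = factory   # creates a new instance on a miss
        self._max = max_size
        self._items = []
        self._lock = threading.Lock()

    def get(self):
        # Try to take the lock without waiting. A caller that loses the
        # race (or finds the pool empty) just gets a fresh instance.
        if self._lock.acquire(blocking=False):
            try:
                if self._items:
                    return self._items.pop()
            finally:
                self._lock.release()
        return self._factory()

    def reuse(self, item):
        # A caller that loses the race, or finds the pool full, drops
        # the item and lets the garbage collector have it.
        if self._lock.acquire(blocking=False):
            try:
                if len(self._items) < self._max:
                    self._items.append(item)
            finally:
                self._lock.release()
```

The trade-off is the same as in the C# version: under contention some reusable instances are thrown away, in exchange for never parking a request thread on the pool.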

Moore’s Law and the Free Lunch

This article was brought to my attention from a few sources. The general theme here is, “concurrency is going to be really important as processors begin to hit physical limits, and these kinds of programs are harder to write.” I thought I would give my spin on it. Here is another one-sentence summary of that article, childishly represented using Windows Paint.

Your friend Jack Albertson

At the admitted risk of sounding too much like a doomsday prophet, my prediction can be summed up as follows: despite attempts at tool and language support (for instance, ), this is going to be painful for a large percentage of developers. Software cycles will probably take a big turn for the worse, so you might be better off working on the quantum computers in your garages now.

My reasoning is heavily influenced by the alleged “object revolution.” The fact is, you can’t claim to be doing object-oriented development just by virtue of using a language that has object-oriented features. And you can’t reap the benefits of doing object-oriented development in that case, either. In this day and age, I still see a ton (scores… hundreds… maybe thousands) of methods that are 200 lines long and take twelve arguments. Now you can put a “virtual” in front of a method like that, but there’s obviously still a problem.

I like this quote from Object Thinking:

Both software engineering and object orientation have achieved a strange status - everyone claims to be doing them without really doing so.

The thesis of that book is that OOP and traditional programming take drastically different mindsets. Changes in mindsets can be really difficult to accomplish. Tools help you but they don’t automatically make you good at the task at hand. We’ve got great OOP tools now, but I think a lot of people still work on teams where deadlines are missed, integrating code written by different people is hard, and any number of shortcuts leads to a mess of spaghetti. We get away with it to an extent, mostly because it is accepted that software projects are late and contain bugs. We work way too hard in the process, though.

And as accurate as I think that is for the “object revolution”, it is only more accurate for the upcoming “concurrency revolution.” Writing a multithreaded app is a lot different than writing a single threaded app.

It could be good news for the highly motivated / educated, but as in any field that is the minority. I’m looking forward to the challenge, but purely for selfish reasons.