Friday, 18 February 2011


Polymorphism is one of the object-oriented paradigms that leads to pain when optimising.



Polymorphism in itself isn't a bad thing. It's a useful design mechanism and has its place. But the virtual function call adds a level of abstraction that makes code difficult to port to SPU or GPU architectures, and adds a barrier to many optimisation techniques.

A common example is using polymorphism and virtual functions to process a list of heterogeneous types with a common base class.

This is bad because:
(a) You're adding the cost of a virtual function call to every object.
(b) You're dispatching to different pieces of code, in a potentially incoherent order, causing I$ (instruction cache) thrash.
(c) You often end up with oversized objects that carry dead code and data around, again hurting cache efficiency.
(d) It's harder to port the code to the SPU (or GPU).
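For concreteness, the pattern being criticised typically looks something like this (Entity, Car and Human are hypothetical names for illustration, not from any particular codebase):

```cpp
#include <vector>

// One common base class so everything can live in one list.
struct Entity {
    virtual ~Entity() {}
    virtual void process() = 0;
};

struct Car : Entity {
    int updates = 0;
    void process() override { ++updates; }  // car-specific update
};

struct Human : Entity {
    int updates = 0;
    void process() override { ++updates; }  // human-specific update
};

// One heterogeneous list: every iteration pays a virtual dispatch,
// and the call target can change from object to object (I$ thrash).
void processAll(std::vector<Entity*>& entities) {
    for (Entity* e : entities)
        e->process();
}
```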

I've been exploring an alternative way of structuring this approach. I've retained a polymorphic external interface for registering objects of different types in a list. However, rather than have a single add() function that takes the base type, I've explicitly made several overloaded functions that take the derived types. Each of these functions adds that object to a list of objects of the same specific derived type. So, instead of having one big heterogeneous list, internally I have several type-specific homogeneous lists.

This way, I eliminate all virtual calls from my processing-side code. I can now process all of the objects of type X, then all of type Y, then all of type Z: I explicitly call X::process() across one list, Y::process() across the next, and so on. This makes the code much easier to optimise and much easier to port to the SPU, because you no longer need to know about the vtable. I also get much better I$ coherency.

Who said we must have one list, after all? Why does the contained object have to pay the cost of the abstraction mechanism? Doesn't it make more sense for the containing object to pay the cost? And a cheaper cost at that! Provided the set of types you are dealing with remains reasonable, maintaining multiple separate lists can be an easy way to enhance efficiency and make your code easier to port to SPU.
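A minimal sketch of this structure, under the same hypothetical type names as before. The overloaded add() functions and the per-type lists are the point; everything else is illustrative:

```cpp
#include <vector>

// Two unrelated concrete types: no base class, no vtable, no vptr
// carried around by every instance.
struct Car   { int updates = 0; void process() { ++updates; } };
struct Human { int updates = 0; void process() { ++updates; } };

class World {
public:
    // Overloaded add(): the container pays the (small, one-off) cost
    // of knowing the set of types, instead of every object paying for
    // the abstraction mechanism.
    void add(const Car& c)   { m_cars.push_back(c); }
    void add(const Human& h) { m_humans.push_back(h); }

    void processAll() {
        // Each loop runs one piece of code over one homogeneous,
        // contiguous array: no virtual dispatch, good I$ coherency,
        // and each loop is a self-contained job to move to an SPU.
        for (Car& c : m_cars)     c.process();
        for (Human& h : m_humans) h.process();
    }

    std::vector<Car>   m_cars;
    std::vector<Human> m_humans;
};
```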


  1. It's a good idea, and will be much more practical in C++0x with variadic templates. In the meantime you could either use type lists or just do it ad hoc each time you need it (would be nice to have a reusable solution to this though).
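A reusable version along the lines this comment suggests might look like the sketch below: a tuple of homogeneous vectors, one per registered type. Note it uses std::get on a tuple by type, which needs C++14 rather than C++0x/C++11, and all names here are illustrative:

```cpp
#include <tuple>
#include <vector>

// A generic "one list per type" container: TypedLists<Car, Human>
// keeps a separate std::vector for each type in the parameter pack.
template <typename... Ts>
class TypedLists {
public:
    // add() deduces the type and routes the object to its own list.
    template <typename T>
    void add(const T& item) {
        std::get<std::vector<T>>(m_lists).push_back(item);
    }

    // Processing code iterates one homogeneous list at a time.
    template <typename T>
    std::vector<T>& list() {
        return std::get<std::vector<T>>(m_lists);
    }

private:
    std::tuple<std::vector<Ts>...> m_lists;
};
```

Processing code then walks list<Car>() and list<Human>() directly, with no virtual dispatch; duplicate types in the pack are rejected at compile time by std::get.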

  2. if (numberOfInstances.IsConsistentlyGreaterThan(numberOfClasses))
       { ...use specific arrays of specific things }
     else
       { ...use polymorphism }

    You can get the right answer easily by just writing the smallest code as and when you need it.

    If doing a driving game, you have an array of cars. Easy. No need to figure out a polymorphic framework.

    A decade later you might find yourself part of some torturous committee process that has to argue for years over how to do this sort of thing (e.g. someone disconnected from development might have decided it would be great if all games used the same framework), and find yourself porting car update & AI code onto SPUs because it's doing car damage calculations every frame, or something silly like that.

    Object orientation can be an overrated concept - great to see the enthusiasm for Haskell on this blog. Writing and re-using functions does the trick; writing custom iterators (macros and templates) does the job of frameworks.

    Some programmers seem to assume C++ virtuals are the ONLY way of doing polymorphism. They're just one of many methods (and one of two methods I know of that just happened to get an extension of C built for them).

    If you ever built your own class system in C or assembler, you'd be dissatisfied with the C++ virtuals method because there is no "static virtual data" (a data structure per class).

    [4] Another reason not to use polymorphic frameworks is the need to break up long update functions, in the case where the update function itself thrashes the I$.

  3. Tom Hammersley, 20 June 2011 at 14:31

    [1] Indeed, there are limits to the scalability and practicality of this approach. But still, that's where refactoring comes in. If the number of classes or instances in a system is scaling beyond that which it was originally designed for, it'll show growing pains elsewhere, too.

    I don't suppose this is an appropriate solution for every design problem, but it can have some useful applications.

    [2] Agreed - I think in many instances we put the cart before the horse. One of the key raisons d'être for OO was re-use; that's what we were really getting at. OO delivered that, but it also delivered a little extra baggage with it. These days coding seems to be returning to a more heterogeneous approach.

    People are happy to just write a function when a function is all that's needed, and a function will do.

    [4] ... plus a long update function is a killer blow for any kind of attempt to parallelise a system.