mr-edd.co.uk :: horsing around with the C++ programming language

Another use for explicit vtables

[24th April 2012]

In my previous post I outlined a technique whereby coupling could be reduced by way of creating a pseudo-vtable.

I've since realised that this technique can also be used to help create an efficient representation for polymorphic types with value semantics. Consider this base class, which we might use as the root of a polymorphic class hierarchy in a computer game:

struct monster
{
    virtual ~monster();
    virtual void attack(player &target);
};

Such polymorphic hierarchies will typically imply dynamic allocation somewhere along the line. For example, we typically won't create a std::vector<monster> because we can't store derived monster objects by value in containers without slicing. Instead, we'll have to do something like keep a container of (smart) pointers, or use a smart container such as boost::ptr_vector<monster>, which in turn means we'll probably have to allocate our derived monster objects on the heap.

The large number of small allocations implied by this approach can be detrimental in a number of ways; some of the locality benefits we may have got with a vector<> have been lost[1] due to the monsters now being dotted around the heap. The allocations themselves can be detrimental to performance, especially if frequent. Contention within allocation routines running in multi-threaded applications can exacerbate this problem.

Let's see how we might avoid these problems by adapting the explicit vtables technique described previously.

Constraints

Right from the start, we're going to assume that all our derived classes are of similar size, and the maximum size of any derived monster type is known. Clearly these constraints will be impractical for some class hierarchies, but let's see what this restriction buys us.

Given that any type of monster will fit in to, say 32 bytes, we can use an appropriately aligned byte buffer to store any monster object:

union monster_storage
{
    void *for_alignment_1;
    std::size_t for_alignment_2;
    double for_alignment_3;
    long double for_alignment_4;

    unsigned char opaque[32]; // large enough for any monster type
};

The first few members of this union are to ensure that the byte array has an alignment suitable for a wide variety of types[2]. We can add additional types to the union, or enlarge the opaque array if we find the alignment or size to be insufficient for a particular monster type (which we will check with static assertions later on, anyway). Alternatively, you could use something like boost::aligned_storage<> instead. I'll stick with the union to keep the code in this post free of external dependencies.

Defining our vtable

Given a storage arena for our monsters, we can now define our pseudo-vtable structure. The function pointers are analogous to the entries in a real virtual table:

struct monster_vtable
{
    // Some generic entries
    void (*destructor)(monster_storage &self);
    void (*copy_constructor)(monster_storage &self, const monster_storage &source);

    // Entries specific to monsters
    void (*attack)(monster_storage &self, player &target);
    // ...
};

Now, the monster base class as we've defined it so far has virtual functions, but for reasons that will soon become clear, we aren't going to need any virtual functions, or even an inheritance hierarchy! All we'll require is that any monster class defines and implements the methods required to 'act' like a monster. In our little example this means that a void attack(player &) method must be present, as should a public copy constructor and destructor (which may be compiler-generated).

With a little bit of template trickery, we can define a unique monster_vtable object for each monster type we care to add:

template<typename M>
struct monster_vtable_for
{
    static M *ptr(monster_storage &s)
    { 
        return reinterpret_cast<M *>(s.opaque); 
    }
    static const M *ptr(const monster_storage &s)
    {
        return reinterpret_cast<const M *>(s.opaque); 
    }

    static void destructor(monster_storage &s) 
    { 
        ptr(s)->~M(); // explicitly call the destructor
    }
    static void copy_constructor(monster_storage &self, const monster_storage &source) 
    {
        new(ptr(self)) M(*ptr(source)); // use placement new and M's copy constructor
    }

    static void attack(monster_storage &self, player &target) 
    {
        return ptr(self)->attack(target); // assume M::attack(player &) exists
    }

    static const monster_vtable instance;
};

template<typename M>
const monster_vtable monster_vtable_for<M>::instance = {
    &monster_vtable_for<M>::destructor,
    &monster_vtable_for<M>::copy_constructor,
    &monster_vtable_for<M>::attack
};

So, for any type of monster, M, monster_vtable_for<M>::instance is a monster_vtable containing pointers to functions that delegate to the corresponding methods of M.

In the destructor() vtable entries, we're manually calling the destructor. This is needed as the destructor of the opaque byte array in monster_storage is unaware of the type of object it contains. The copy_constructor vtable entries use placement new to copy-construct an object in to the opaque byte array.

The monster class

We now have all the ingredients to define the monster class:

class monster
{
    public:
        template<typename M>
        monster(const M &m) :
            v(&monster_vtable_for<M>::instance)
        {
            // TODO: static assert goes here for size and alignment of M
            new (s.opaque) M(m); // placement new
        }

        ~monster() { v->destructor(s); }

        monster(const monster &other) : v(other.v) { v->copy_constructor(s, other.s); }

        monster &operator= (const monster &rhs)
        {
            v->destructor(s);
            v = rhs.v;
            v->copy_constructor(s, rhs.s); // TODO: consider exceptions
            return *this;
        }

        void attack(player &target) { v->attack(s, target); }

    private:
        const monster_vtable *v;
        monster_storage s;
};

Now aside from a handful of TODOs which I'll get back to shortly, we have a class that's capable of storing any monster as a value:

struct ogre
{
    explicit ogre(float health) : health(health) { }
    void attack(player &) { std::cout << "bashing player with club...\n"; }

    float health;
};

struct poltergeist
{
    poltergeist(float opacity, float speed) : opacity(opacity), speed(speed) { }
    void attack(player &) { std::cout << "slamming a door...\n"; }

    float opacity;
    float speed;
};

int main()
{
    monster m1(ogre(750));
    monster m2(poltergeist(0.07F, 12));

    player poor_sid;
    m1.attack(poor_sid);
    m2.attack(poor_sid);

    monster m3(m1);
    m3.attack(poor_sid); // also an ogre

    m3 = m2; // but now a poltergeist
    m3.attack(poor_sid);

    return 0;
}

Let's run it:

> ./poor_sid
bashing player with club...
slamming a door...
bashing player with club...
slamming a door...

We can copy monsters about without so much as tickling the heap. We can also store monster objects in containers, again by value; no small allocations, heap fragmentation, threaded allocator contention, smart pointers or smart containers, etc. So the initial constraints actually bought us quite a lot.

In an ideal world, we might imagine a programming language that could do this transformation for us, where the linker determines the size and alignment of the opaque byte array. Unfortunately, as things stand, the amount of C++ scaffolding required will likely be a consideration when evaluating the viability of this approach, though once the initial pseudo-vtable is laid out, things may not be too bad.

Ensuring correct alignment and size

Now to revisit those TODO comments. The first one appears in monster's constructor. We need to ensure that the type with which we're creating a monster will indeed fit inside the opaque byte array. We must also ensure compatible alignment.

First, let's create a macro to get the alignment of a type as a compile-time integral constant:

#if defined(_MSC_VER)
#   define ALIGN_OF(type) __alignof(type)
#elif defined(__GNUC__)
#   define ALIGN_OF(type) __alignof__(type)
#else
#   error "compiler not supported :("
#endif

If you're not using Visual C++ or GCC, you could try boost::alignment_of<>.

Here's a meta-function that we can use inside a static assertion:

template<typename T>
struct size_and_alignment_compatible_with_monster_storage
{
    static const bool value =
        (sizeof(T) <= sizeof(monster_storage().opaque)) &&
        ((ALIGN_OF(monster_storage) % ALIGN_OF(T)) == 0);
};

// ...

class monster
{
    public:
        template<typename M>
        monster(const M &m) :
            v(&monster_vtable_for<M>::instance)
        {
            STATIC_ASSERT( size_and_alignment_compatible_with_monster_storage<M>::value );
            new (s.opaque) M(m); // placement new
        }

        // ...
};

So now if we try to create a monster from an object that won't fit in to our byte array, we'll be presented with a compiler error, allowing us to adjust monster_storage as required.

Dealing with exceptions thrown while making copies

The second TODO is a little harder to resolve. Let's see it again:

monster &operator= (const monster &rhs)
{
    v->destructor(s);
    v = rhs.v;
    v->copy_constructor(s, rhs.s); // TODO: consider exceptions
    return *this;
}

If the call to v->copy_constructor() throws on the marked line, it means that we'll be left with garbage data in our opaque byte array. This in turn means that any vtable we're left with will do unspeakable evils if the program is allowed to continue to the point where one of its entries are called. Even the monster's destructor is unsafe as this point!

Possible solutions to this problem include:

I don't like any of these solutions much, but I find myself leaning towards the final option. It places the least burden on the clients of monster and only requires special-case behaviour in this one place, as opposed to multiple null-checks for the 2nd option.

Here's what it might look like:

struct null_monster
{
    void attack(player &) { }
};

// ...

class monster
{
    public:
        // ...

        monster &operator= (const monster &rhs)
        {
            v->destructor(s);
            v = rhs.v;
            try
            {
                v->copy_constructor(s, rhs.s);
            }
            catch (...)
            {
                // Leave the monster in a valid (if useless) state
                v = &monster_vtable_for<null_monster>::instance;
                new (s.opaque) null_monster(); // placement new, won't throw.
                throw;
            }
            return *this;
        }
};

Adding a dynamic_cast equivalent

A few other features are needed to actually make this usable in the real world. Though I think it's good practice to avoid casting wherever possible, there are inevitably a few occasions where a dynamic_cast can save the day. We can add another function to our vtable in order to support a casting system that's equivalent to dynamic_cast.

struct monster_vtable
{
    void (*destructor)(monster_storage &self);
    void (*copy_constructor)(monster_storage &self, const monster_storage &source);
    void (*throw_this)(const monster_storage &self); // <-- ** NEW **

    void (*attack)(monster_storage &self, player &target);
};

template<typename M>
struct monster_vtable_for
{
    // ...

    static void throw_this(const monster_storage &self) { throw ptr(self); } // <-- ** NEW **

    // ...
};

template<typename M>
const monster_vtable monster_vtable_for<M>::instance = {
    &monster_vtable_for<M>::destructor,
    &monster_vtable_for<M>::copy_constructor,
    &monster_vtable_for<M>::throw_this, // <-- ** NEW **
    &monster_vtable_for<M>::attack
};

This modification is possibly somewhat unexpected, as on the face of it, it appears to have little to do with casting! However, we can make good use of the fact that it's possible to catch a pointer-to-derived as a pointer-to-base:

class monster
{
    public:
        // ...

        template<typename T>
        const T *dyn_cast() const
        {
            try { v->throw_this(s); }
            catch (const T *p) { return p; }
            catch (...) { } // type of object in opaque buffer isn't related to T
            return 0;
        }


        // A non-const overload is left as an exercise for the reader ;)
        // ...
};

dyn_cast() can be used like so:

monster m(ogre(1004));
assert( m.dyn_cast<ogre> != 0);
assert( m.dyn_cast<poltergeist> == 0);

It's also worth pointing out that if we only want to check for specific type (rather than for the specific type or a derived type), we can do without adding a new vtable entry, as we can actually use the vtable pointers themselves to check for type equivalence:

class monster
{
    public:
        // ...

        template<typename T>
        bool holds() const { return v == &monster_vtable_for<T>::instance; }

        // ...
};

// ...

monster m(ogre(8764));
assert(m.holds<ogre>());
assert(!m.holds<poltergeist>());

For a more robust implementation, it's probably worth using boost::remove_const<T>, as &monster_vtable_for<T>::instance will give you a different address to &monster_vtable_for<const T>::instance !

Adding comparison operators

We can use the same compare-the-vtable-pointers trick to help implement comparison operators. For example, to add an equals operator, we first make the following additions to the vtable classes:

struct monster_vtable
{
    // ...

    bool (*equal)(const monster_storage &lhs, const monster_storage &rhs);

    // ...
};

template<typename M>
struct monster_vtable_for
{
    // ...

    static bool equal(const monster_storage &lhs, const monster_storage &rhs)
    {
        // We're assuming that each type of monster provides an 
        // operator== implementation.
        return *ptr(lhs) == *ptr(rhs);
    }

    // ...
};

template<typename M>
const monster_vtable monster_vtable_for<M>::instance = {
    // ...
    &monster_vtable_for<M>::equal,
    // ...
};

monster_vtable_for<M>::equal() assumes that each of our monster types implements an operator==. For example:

bool operator== (const ogre &lhs, const ogre &rhs)
{
    return lhs.health == rhs.health;
}

Now in the monster class, we can define an operator== that uses the vtable pointers to check for type-equivalence, before dispatching the actual equality comparison through the vtable:

class monster
{
    public:
        // ...

        bool operator== (const monster &rhs) const { return v == rhs.v && v->equal(s, rhs.s); }
        bool operator!= (const monster &rhs) const { return !(*this == rhs); }
};

Compared to what we'd have to do with regular polymorphic hierarchies, I think this is actually quite clean. Of course, other comparison operators can be added in the same way (less-than, greater-than, etc).

Conclusion

For particular situations, I think this technique could prove very useful, especially given a polymorphic hierarchy with a fixed number of small-sized classes. But it's not without its drawbacks.

Pros:

Cons:

Downloads

Footnotes
  1. depending on the quality and characteristics of our allocator []
  2. strictly speaking, I do not believe this approach is portable as far as the standard is concerned, but should work everywhere in practice []

Comments

(optional)
(optional)
(required, hint)

Links can be added like [this one -> http://www.mr-edd.co.uk], to my homepage.
Phrases and blocks of code can be enclosed in {{{triple braces}}}.
Any HTML markup will be escaped.