mr-edd.co.uk :: horsing around with the C++ programming language

Looking inward

[1st May 2008]

You often hear people say that they would like more introspection capabilities in C++. It would certainly help when creating mock objects and such-like. On the other hand, if you think you need that kind of thing, you're probably better off using a different language in my opinion, or at least embedding a scripting language within your application.

With that said, there are a few tricks that can actually get you quite far. The Boost.TypeTraits library provides a host of meta-functions that help you to find out the properties of a given type. Similarly, Loki also has some handy meta-tools. In this post, however, I really want to look in to detecting whether or not different kinds of functions and inner types exist.

Checking for class-scope typedefs

A lot of the C++ meta-programming techniques that are floating about today were cobbled together using inventive and cunning combinations of language features that weren't explicitly envisioned during the construction of the standard.

We were given one tool, however. The standard explicitly defines the concept of SFINAE, which is an acronym for Substitution Failure Is Not An Error. It's a bit of a mouthful and not all compilers support it[1], but it comes in handy in a bunch of places.

So what does SFINAE do? It effectively gives you a little bit of extra control with respect to overload resolution.

Let's have a look at a quick example.

#include <vector>
#include <iostream>

template<typename T>
struct has_iterator
{
    typedef char small;
    typedef char (&big)[2];

    template<typename U>
    static big test(...);

    template<typename U>
    static small test(typename U::iterator *);

    static const int value = (sizeof test<T>(0) == 1);
};

int main()
{
    // Does the type 'int' have an internal iterator typedef?
    std::cout << has_iterator<int>::value << '\n';

    // Does the type 'std::vector<int>' have an internal iterator typedef?
    std::cout << has_iterator<std::vector<int> >::value << '\n';

    return 0;
}

This program will output the following on a conforming compiler:

0
1

It uses what I like to call the big/small test, which crops up all over the place in C++ meta-programming and makes good use of SFINAE. Lets look at how it works.

Inside the has_iterator template class there are two types, small and big, which are defined in such a way as to have sizes of 1 byte and more-than-1 byte respectively.

Next we have two overloaded static functions in there, called test(). Note that each returns a type of a different size and is able to take a single argument.

When test<T>(0) is called, something halfway-magical happens. During overload resolution, the second definition of test<T> is looked at first as it is more specific, colloquially speaking.

Now, if the type T in question doesn't have an internal type called iterator, this overload of test<T> doesn't make any sense. One might imagine at first that the compiler would spit out an error, but SFINAE explicitly allows the compiler to ignore this non-sensical overload and carry on looking, at which point it will come to the first overload of test<T>, which is always valid.

So if and only if T::iterator makes sense, test<T>(0) returns an object that is 1 byte in size. So we put the function call in to a sizeof test and define the value constant appropriately.

One important thing to note is that the expression passed to the sizeof operator is not evaluated; only the size is deduced. This means that we don't actually have to provide definitions of the test functions.

Checking for member functions

This one works on the same principle, but you see it used less often. SFINAE here is used inside a template argument, rather than a regular function argument.

Sometimes it's useful to know if a class has a member function of a particular name. I toyed with detecting the presence of a clone() method when I was writing value_ptr, but in the end I used something else (more on that later).

Let's see how we could detect that a class has a begin() member function.

#include <vector>
#include <iostream>

template<typename T, typename R>
struct has_begin_method
{
    typedef char small;
    typedef char (&big)[2];

    template<typename U, R (U::*)(void)>
    struct sfinae { };

    template<typename U> static small test(sfinae<U, &U::begin> *);
    template<typename U> static big test(...);

    static const bool value = sizeof test<T>(0) == sizeof(small);
};

int main()
{
    // Does the type 'int' have a begin() method?
    std::cout << has_begin_method<int, void>::value << '\n';

    // Does the type 'std::vector<int>' have a begin() method?
    std::cout << has_begin_method<std::vector<int>, std::vector<int>::iterator>::value << '\n';

    return 0;
}

The catch with this technique is that you must know a few things to create the test:

For example, has_begin_method<const std::vector<int>, std::vector<int>::const_iterator>::value is false because has_begin_method looks for a non-const member function.

However, we could also add another inner class, sfinae_const perhaps, that accepts a const member function and another overload of test():

template<typename T, typename R>
struct has_begin_method
{
    typedef char small;
    typedef char (&big)[2];

    template<typename U, R (U::*)(void)>
    struct sfinae { };

    template<typename U, R (U::*)(void) const>
    struct sfinae_const { };

    template<typename U> static small test(sfinae<U, &U::begin> *);
    template<typename U> static small test(sfinae_const<U, &U::begin> *);
    template<typename U> static big test(...);

    static const bool value = sizeof test<T>(0) == sizeof(small);
};

Now has_begin_method<const std::vector<int>, std::vector<int>::const_iterator>::value is true, too.

You could add further machinery in there to check for member functions that return not only R, but also allow R & and const R &, if need be.

Also, I've designed the code here to look for a regular member function, but you can make some simple adjustments to look for static member functions, too.

Checking for namespace scope functions

We can also check for the existence of namespace-scope functions (free functions). I ended up using a similar technique in the implementation of value_ptr. If the user provided a namespace-scope clone() function with the correct signature, value_ptr picked it up and used it.

So let's say we want to detect if there exists a free function, apply() that can be called with an argument of type T. We can create a meta-function to detect this like so:

#include <iostream>

struct not_implemented { };

template<typename U>
static not_implemented apply(const U &);

template<typename T>
struct implements_apply
{
    typedef char small;
    typedef char (&big)[2];

    static small test(not_implemented);
    template<typename U> static big test(const U &);

    static T t_;

    static const bool value = sizeof test(apply(t_)) == sizeof(big);
};

int apply(int x) { return x; }

int main()
{
    std::cout << implements_apply<int>::value << '\n';       // prints 1
    std::cout << implements_apply<unsigned>::value << '\n';  // prints 0

    return 0;
}

Again, the same basic big/small test is used. This particular technique works by creating a fallback template function that will take an argument of any type. If this function gets called, we know that the apply() function is implemented to the T in implements_apply<T>.

There are a couple of shortcomings. The first is that it won't work with free functions that return void. A return type is needed because we need to pass the result on to another function (one of our test()s).

Also, if you need to make an implements_XXX<> for a function XXX() that takes more than 1 argument, you're going to have to add some extra template parameters. That's not too much of a big deal, though. I suspect that you could even make a implements_XXX_taking_four_arguments_and_whose_third_argument_is_a<T> meta-function (for want of a better name!).

One thing that I found particularly annoying with this technique, however, is that it checks the static type. For example, if I defined a function bool apply(const base &), then implements_apply<base>::value would be true as expected, but implements_apply<derived>::value would be false.

There's a partial solution to this:

#include <iostream>

struct convertible_from_anything
{
    template<typename T>
    convertible_from_anything(const T &);
};

struct not_implemented { };
static not_implemented apply(convertible_from_anything);

template<typename T>
struct implements_apply
{
    typedef char small;
    typedef char (&big)[2];

    static small test(not_implemented);
    template<typename U> static big test(const U &);

    static T t_;

    static const bool value = sizeof test(apply(t_)) == sizeof(big);
};

Now the overload resolution for apply(d) for some derived object, d, will pick apply(const base &), rather than apply(convertible_from_anything).

That's better, but as I said, it's only a partial solution. Why? Non-explicit constructors is why! Suppose we have a struct x that has a conversion constructor from a double:

struct x
{
    x(double) { }
};

bool apply(const x &);

Now implements_apply<x>::value results in a compiler error because overload resolution becomes ambiguous between apply(const x &) and our fallback apply().

If you have been careful about giving your types explicit constructors where necessary, you'll probably be ok. But if you're writing a library that is designed to interact with other people's code, as I was with value_ptr, this is probably something you need to be aware of.

The solution I came up with when checking for a clone() free function for use with a value_ptr<T>, was to instead check for a clone(const T *) (note the *). Fortunately, I was able to control the signature of the function I was testing for, else I wouldn't have been able to do this. Pointers don't have constructors, explicit or otherwise, so I could work around this little annoyance.

Uses

There are a bunch of uses for the stuff I've used here. The techniques aren't as flexible or all-encompassing as the introspection available in your average dynamic language, or even those available in something like Java, but quite often they'll get you far enough.

I've already mentioned value_ptr<> as an instance in which I've used these techniques. Another cool thing to try would be to create a function hash(x) which returns a hash of the object x, regardless of what x is.

The idea would be to try to create some kind of iterator that enumerates the logical bytes in the object. For example hash(x) where x is an int would generate a hash using the bytes in x itself. If x was something like a std::list<double> however, hash would detect that x's type has begin() and end() member functions and use those to iterate over the elements of the list, and then the bytes in each element, to generate a hash.

Clients could make a class X hash()-able by implementing hash_begin(const X &) and hash_end(const X &) to return iterators that enumerate the bytes in the custom type.

Of course, this could be applied recursively (at compile time) for a structure such as vector<list<set<X> > >!

You could also find out how much memory a data structure is using by employing a similar technique. It's something I'd like to try at some point.

Footnotes
  1. including some of the more recent versions of Borland's compiler []

Comments

(optional)
(optional)
(required, hint)

Links can be added like [this one -> http://www.mr-edd.co.uk], to my homepage.
Phrases and blocks of code can be enclosed in {{{triple braces}}}.
Any HTML markup will be escaped.