mr-edd.co.uk :: horsing around with the C++ programming language

Stop viral NaNs with smart references

[30th May 2008]

In C++ (and many other languages that use similar floating point models), dividing a floating point object by 0 will yield one of three special values: inf, -inf, or nan (Not A Number).

#include <iostream>
#include <ostream>

int main()
{
    using namespace std;
    cout << (1.0 / 0.0) << endl;  // inf
    cout << (-1.0 / 0.0) << endl; // -inf
    cout << (0.0 / 0.0) << endl;  // nan
    return 0;
}

Dividing by 0 makes no mathematical sense, and yet there's nothing stopping us from writing it in C++ code — usually by accident, when normalizing a zero-vector for example — and so some value must be assigned to the result of such an operation.

What's slightly horrifying is that once you have one of these special values in your code, they begin to spread. Multiplying a floating point value by nan gives you nan and it applies to the other arithmetic operators too. Most operations involving ±inf also result in ±inf, too[1]. In the blink of an eye your system could become flooded with meaningless garbage.

What can we do to mitigate this problem?

The nice thing about primitive types

Before I present a little tool that will stop these values from spreading, let me share with you a quick observation.

One feature I have often wanted in C++ is the ability to overload the . operator in order to create smart references. This would be nice for value_ptr because then I could make it act just like the polymorphic object it wraps.

Alas, we can't overload the . operator. But for primitive types, this doesn't matter. Since they have no members, we don't need to select members using the a.b syntax.

So it's possible to create a class that supports all the operations of a primitive type, but has extra code injected at key points — in the copy constructor, or assignment operator, for example. We can put this to good use by checking for nans and infs when performing arithmetic and making assignments. This will catch the mathematical errors that generate these values at the earliest possible moment, before weirdness starts to propagate.

An example: protecting a matrix

Let's say that you have a matrix class with the interface shown here:

class matrix
{
    public:
        matrix(std::size_t height, std::size_t width);
        double &operator() (std::size_t i, std::size_t j);
        const double &operator() (std::size_t i, std::size_t j) const;

    // etc ...
};

So to set the element in the 42nd row and 12th column to 0.5, you would write:

matrix m(100, 100);
m(42, 12) = 0.5;

But once you get a single nan in your matrix, it will spread like wild-fire when performing multiplications, decompositions and other common matrix operations.

One problem with the current interface is that we can't check that sensible values are assigned to elements of the matrix. We could change the interface so that we would have to call m.set(42, 12, 0.5) instead, but that's much less natural. By using a smart reference, however, we can keep our interface and inject some code to vet the incoming values:

template<typename T>
class assignment_checking_ref
{
    public:
        assignment_checking_ref(T &target) : target_(&target) { }
        operator T &() const { return *target_; }
        T *operator &() const { return target_; }
        assignment_checking_ref &operator= (T rhs) const
        {
            check_nan_and_inf(rhs);
            *target_ = rhs;
            return *this;
        }

        // Similar checks in these:
        assignment_checking_ref &operator/= (T rhs) const;
        assignment_checking_ref &operator*= (T rhs) const;
        assignment_checking_ref &operator+= (T rhs) const;
        assignment_checking_ref &operator-= (T rhs) const;
    private:
        T *target_;
};

class matrix
{
    public:
        matrix(std::size_t height, std::size_t width);
        assignment_checking_ref<double> operator() (std::size_t i, std::size_t j);
        const double &operator() (std::size_t i, std::size_t j) const;

    private:
        // ...
};

Now whenever somebody writes the following code, the value on the right hand side of the assignment is checked by check_nan_and_inf(rhs) inside assignment_checking_ref's assignment operator.

matrix m(100, 100);
m(42, 12) = 0.0 / 0.0; // BOOM! Assertion, exception, or something to that effect

Note that by providing enough operator overloads, we can make the fact that an assignment_checking_ref is returned completely invisible to the client of the matrix class. All the code they wrote before still works (after recompiling and relinking). Even the address-of operator (&) has been overloaded to take care of the case where people take the address of an element of the matrix. Indeed, this is one of the few cases when overloading this operator may actually be kosher.

Now, this code will have a performance hit if you're working with matrices rather heavily, but it would be simple to revert to the direct assignment code in release builds by using a #if here or there.

Checking for strange values

Curiously, nan does not compare equal to itself. Indeed it is the only value, v, such that v != v. Many standard library implementations come bundled with isnan() and isinf() functions/macros, but some still don't as these facilities have only been mandated by more recent standards. Fortunately, we can do the checks quite easily ourselves:

So the body of the check_nan_and_inf() function might be:

void check_nan_and_inf(double x)
{
     assert(x == x);
     assert(x != 1.0 / 0.0);
     assert(x != -1.0 / 0.0);
}

It's worth creating three overloads of this function for float, double and long double arguments and tucking it in to its own translation unit. If you make it a template function, you'll find that you get compiler warnings each time the template is instantiated (because many compilers warn when a divide-by-zero situation can be found statically)!

One step further

The assignment_check_ref<> code is good, but it's still breakable. Some naughty so-and-so might do the following:

double &element = m(42, 12);
element = x; // no checking performed, assigns directly to the double

Of course, a client wouldn't have intentionally written code like this to work around our defenses, but it is occasionally useful to get a reference to mutable data in this way.

For belt-and-suspenders protection, we need something a little more heavyweight — a type that wraps a primitive and allows us to inject checks on any operation.

For such a type we have to overload pretty much every operator there is. I haven't bothered doing this, because I consider it overkill, but by using a traits mechanism we can also eliminate the need to store a hefty number of callbacks inside the wrapper object. I'm thinking something along the lines of:

template<typename T>
struct default_check_traits
{
    void check_construction(T source) { }
    void check_assign(T current, T source) { }
    void check_in_place_addition(T current, T source) { }
    // lots more for the other operators ...
};
template<typename T, typename CheckTraits = default_check_traits<T> >
class checked
{
    public:
        typedef CheckTraits traits;
        checked(T value) : value_(value) { traits::check_construction(value); }

        // Note the non-explicit constructor means we can assign
        // Ts to checked<T>s via assignment
        checked &operator= (checked other)
        {
            traits::check_assign(value_, other.value_);
            value_ = other.value_;
            return *this;
        }

        checked &operator+= (checked other)
        {
            traits::check_in_place_addition(value_, other.value_);
            value_ += other.value_;
            return *this;
        }

        // lots more for the other operators...

    private:
        T value_;
};

Here you can supply a custom traits class that performs the checks at the points you think necessary. You derive from default_check_traits<> to get fall-back check implementations, if you don't want to implement every single check function.

Once we have such a class, we could use a checked<double> inside our matrix class, for example, as the underlying type of element.

But as I said earlier, I think this is probably over-the-top. Maybe I'll kick myself when I start seeing nans before my eyes again, though…

Footnotes
  1. except quotients such as 10 / inf, which results in 0 []

Comments

Rydinare

[11/07/2008 at 12:41:00]

Interesting post and I did want to compliment you on having an awesome blog. I totally see where you’re going with this, although when I did something to this effect, I implemented things with a slightly different approach. The one concern was too many implementation details slipping into the interface in cases where that information might be confusing or not helpful. What I had developed some years back, was template wrappers around primitives, as well. Upon any operation in the primitive wrapper where a value might be changed, I did a check to check it’s validity, based on template arguments to the class which specified a valid range. Then, in a class like matrix, the member would be of this safer type. Here’s a quick (certainly not optimized) example for a dynamic matrix:

template<typename T>
class RangedPrimitive
{
public:
... provide overloads for every operation that can be done on a primitive ...
... upon each overload, if there is a possibility of being outside the range, check and throw an exception if so ...

    // one example to demonstrate
    RangedPrimitive& operator=(const T& value)
    {
        if (!TRangePolicy::isValid(value))
        {
            throw Constraint_Error();
        }

        internalValue_ = value.internalValue_;

        return *this;
    }
};

template<typename T>
class Matrix
{
public:

    void set(size_t row, size_t column, const T& value)
    {
        elements_.at(row, column) = value; // This line would throw based if the range policy of the element was violated
    }

private:
    std::vector<std::vector<RangedPrimitive<T, RangeDisallowNans > > > elements_;
};

I got this idea because at my first job, I did a lot of Ada, and Ada had the idea of valid ranges and subranges built into the language. It definitely made things a lot easier, because there was a defined exception called Constraint_Error that occurred whenever an attempt was made to set something outside the range. Ada compilers also had the ability to check for any range violations they could figure out statically. So, for example, if you did something like the following:

rangedValue = 0.0 / 0.0;

you would actually get a compile error. You can simulate this in C++ by providing template versions of all methods that take a value rather than a type and remembering to call those, but I think that gets a bit messy.

(optional)
(optional)
(required, hint)

Links can be added like [this one -> http://www.mr-edd.co.uk], to my homepage.
Phrases and blocks of code can be enclosed in {{{triple braces}}}.
Any HTML markup will be escaped.