mr-edd.co.uk :: horsing around with the C++ programming language

Undo/redo part 3: combining actions

[8th February 2011]

Previously in this undo/redo series we've looked at:

  1. how to set up a basic action interface and a simple undo_stack class
  2. how exception safety guarantees of actions affect the undo_stack's implementation

In this episode, I'd like to look at how to efficiently combine two or more actions in to a single compound action and why we might want to do so.

I guess this topic wanders back in to the realms of what others articles have discussed before when looking at undo/redo. We'll see however, that exception safety will come back in to the picture briefly, which is something all other articles I've read seem to neglect.

Why might we want to combine actions?

First, let's think about why we might want a mechanism to combine multiple actions in to a single compound action.

If we're creating a text editor or word processing application, we might want Edit > Undo to undo the last flurry of key presses we made, rather than only the very last character insertion. Various applications exhibit this behaviour. In fact, the Firefox HTML form in to which I'm typing this article implements this system.

So, we might have a glyph_insertion class that implements the action interface. Supposing we already have document and unicode_glyph types, glyph_insertion might look like this:

struct glyph_insertion : action
{
    document &doc;
    const unicode_glyph inserted;
    const std::size_t line;
    const std::size_t column;

    glyph_insertion(document &doc,
                    unicode_glyph inserted, 
                    std::size_t line, std::size_t column) :
        doc(doc),
        inserted(inserted),
        line(line),
        column(column)
    {
    }

    virtual void redo() // overrides action::redo()
    {
        doc.insert_glyph(inserted, line, column);
        doc.set_cursor_position(line, column + 1);
    }

    virtual void undo() // overrides action::undo()
    {
        doc.erase_glyph(line, column);
        doc.set_cursor_position(line, column);
    }
};

So as the user is hammering away on their keyboard, lots of these actions are being created. In the event handler, we might use a timer to decide which glyph_insertions we'd like to combine so that the undo-stack sees them as a single document mutation.

This is just one instance in which we might want to combine actions. Another situation we often discuss at work as a thought experiment is the dragging of an object around in some kind of drawing application. Here, lots of little move_object actions might be generated and yet we only want a single action that represents the net movement to appear in the undo stack.

A simple multi_action class

So, we've decided that we need to combine multiple actions in to a single element to be placed on the undo_stack. How might we perform this combining operation? The most straight-forward approach is to create a multi_action class that has a container of actions:

// Recall that action_ptr is a typedef for boost::shared_ptr<action>

class multi_action : public action
{
    public:
        multi_action(const action_ptr &first) { add(first); }
        
        void add(const action_ptr &another) { subactions.push_back(another); }

        virtual void undo() // overrides action::undo()
        {
            std::size_t a = subactions.size();
            while (a != 0)
                subactions[--a]->undo();
        }

        virtual void redo() // overrides action::redo()
        {
            std::size_t a = 0, n = subactions.size();
            while (a != n)
                subactions[a++]->redo();
        }

    private:
        std::vector<action_ptr> subactions;
};

Nothing particularly fancy is going on here. The multi_action's undo() method calls undo() on all the actions it contains (in reverse order) and redo() does the opposite.

So returning to our text-editor/word-processor situation, our key-press-handling code might look a bit like this:

// Something like an MVC controller
class controller
{
    public:
        // called on response to a key press while editing pane is active
        void on_insertion(key k);

        // called when user hasn't typed anything for a while
        void on_insertion_timeout();

        // called when user clicks undo in the edit menu (or Ctrl-Z)
        void on_undo()

        // ...

    private:
        action_ptr make_glyph_insertion_action(key k);

        void extend_insertion_timeout();

        // ...

    private:
        boost::shared_ptr<multi_action> inserts;
        undo_stack undos;
        // ...
};

void controller::on_insertion(key k)
{
    action_ptr gia = make_glyph_insertion_action(k);

    if (inserts) inserts->add(gia);
    else inserts.reset(new multi_action(gia));

    extend_insertion_timeout();
}

void controller::on_insertion_timeout()
{
    if (inserts)
    {
        undos.push(inserts);
        inserts.reset();
    }   
}

void controller::on_undo()
{
    on_insertion_timeout(); // flush pending multi_action
    assert(undos.can_undo());
    undos.undo();
}

Reasonably simple and I hope not too surprising. However, this system is not without its problems. The first of which is that what we have so far is really rather inefficient in terms of memory usage.

For each glyph inserted, we're storing quite a bit of information (perhaps in addition to any member variables we might have in the action base class). It would be nice if we could condense all the generated glyph_insertion actions in to a single multi_glyph_insertion action that might hold a string of unicode_glyphs, thus reducing the memory usage of the bulk insertion from

N * (sizeof(book keeping) + sizeof(unicode_glyph))

to

sizeof(book keeping) + N * sizeof(unicode_glyph)

One way of achieving this, is to change the type of controller::inserts from a shared_ptr<multi_action> in to an action_ptr and then use dynamic_casts to see if the 'incoming' action is to be combined with a glyph_insertion or a multi_glyph_insertion. In other words, controller::on_insertion() becomes:

void controller::on_insertion(key k)
{
    if (!inserts)
        inserts = make_glyph_insertion_action(k);
    else
    {
        multi_glyph_insertion *mgi = 
            dynamic_cast<multi_glyph_insertion *>(inserts.get());

        if (mgi)
        {
            // assume multi_glyph_insertion has a 'string' member
            mgi->string.push_back(glyph_for_key(k));
        }
        else
        {
            assert(typeid(*inserts) == typeid(glyph_insertion));

            glyph_insertion &gi = static_cast<glyph_insertion &>(*inserts);
            boost::shared_ptr<multi_glyph_insertion> replacement(
                new multi_glyph_insertion
            );

            replacement->string.push_back(gi.inserted);
            replacement->string.push_back(glyph_for_key(k));
            inserts = replacement;
        }
    }

    extend_insertion_timeout();
}

Though we've ended up with something that's more space-efficient, I'd say this code is getting a little unruly. Perhaps an easier way would be forget about the glyph_insertion action entirely and use only the multi_glyph_insertion class, but we'll continue in the vein of the above modification for the sake of discussion as other applications may not have similar luxuries — they'll have lots of different kinds of action that can and should be combined in to a more efficient representation. So let's see if we can come up with a more general system.

Solution 1: adding an assimilate() method to the action interface

Here the idea is to give our action base class a new virtual function that takes a reference to another action, which it should try to merge with itself:

class action
{
    public:
        // merge *other in to this action.
        // return false if not possible.
        virtual bool assimilate(const action_ptr &other) { return false; }

        // ... everything else as before
};

A bool is returned which the calling code should use to decide if a multi_action should be created, for those situations where there's no better way of merging the two actions involved.

This is the approach that Qt encourages in its undo framework.

So what might an implementation of assimilate() look like in a derived class? Let's look again at our text editor example. multi_glyph_insertion::assimilate() should be able to assimilate a glyph_insertion action:

bool multi_glyph_insertion::assimilate(const action_ptr &other)
{
    const glyph_insertion *gi = 
        dynamic_cast<const glyph_insertion *>(other.get());

    if (gi != 0)
    { 
        string.push_back(gi->inserted);
        return true;
    }

    // we can cope with multi_glyph_insertions too:
    const multi_glyph_insertion *mgi =
        dynamic_cast<const multi_glyph_insertion *>(other.get());
    
    if (mgi != 0)
    {
        string += mgi->string;
        return true;
    }

    // we can't assimilate the action
    return false;
}

Unfortunately, we're again performing some ugly type-casting. However, we can tuck the brittle casting parts away. Suppose we have two functions that can be used to merge the actions specified as arguments:

void merge_glyphs1(multi_glyph_insertion &target, 
                   const glyph_insertion &incoming)
{
    target.string.push_back(incoming.inserted);
}

void merge_glyphs2(multi_glyph_insertion &target, 
                   const multi_glyph_insertion &incoming)
{
    target.string += incoming.string;
}

Note that the arguments to these functions are references to the static types of the derived actions and not to the base action. Now, with a little trickery, we can re-write multi_glyph_insertion::assimilate() as follows:

bool multi_glyph_insertion::assimilate(const action_ptr &other)
{
    return merge_actions(*this, *other)
                               ^ merge_glyphs1
                               ^ merge_glyphs2;
}

And, here's what merge_actions() looks like:

// merge_actions() will return one of these
template<typename ActionType>
class action_merger
{
    public:
        action_merger(ActionType &target, const action &incoming) :
            target(&target),
            incoming(&incoming),
            merged(false)
        {
        }

        operator bool () const { return merged; }

        template<typename Other>
        action_merger &operator^ (void (*mergefunc)(ActionType &a1, const Other &a2))
        {
            if (merged) return *this;

            const Other *rhs = dynamic_cast<const Other *>(incoming);
            if (rhs)
            {
                mergefunc(*target, *rhs);
                merged = true;
            }
            return *this;
        }

    private:
        ActionType *target;
        const action *incoming;
        bool merged;
};

template<typename ActionType>
action_merger<ActionType> merge_actions(ActionType &target, const action &incoming)
{
    return action_merger<ActionType>(target, incoming);
}

If you think the overloading of operator^ is a little bit too cute, you could be boring and use a normal method name instead like so:

bool multi_glyph_insertion::assimilate(const action_ptr &other)
{
    return merge_actions(*this, *other)
                   .with(merge_glyphs1)
                   .with(merge_glyphs2); // ZZZZzzzz ;)
}

But however you choose to phrase this mechanism, the point is that we're only doing the casting in a single place, which servers to reduce both the ugliness of the code and the chance for casting- or null-pointer-related mistakes.

With that now fixed, there are still a couple of things that I don't like about this approach.

The first is the difficulty in extending the system. It is up to the implementer of each derived action to survey all other actions in the system and use the appropriate merge functions in their assimilate() method. If new types of action are subsequently created, we have to remember to revisit these method definitions to see if an update is needed.

My second gripe is that no allowance is made for the situation where the best way of combining two actions of derived types A1 and A2 respectively, is to create a new action of type A3. We could modify the call signature of action::assimilate() to address this point, allowing it to return a possibly-null-valued action_ptr instead of a bool. But in doing this, we'd usually have to do more allocation and copying than that required by assimilate() in its current form. Perhaps not a particularly big deal though, all things considered.

Solution 2: having a registry of combiner functions

This next approach will address these issues in quite a convincing fashion, I think. But as we'll see, it won't be without problems of its own.

If we wander out of undo/redo-land for a moment, we can can see that what we're trying to do here is solve a particular case of the multi-method problem; C++ only supports dynamic method dispatch based on the this argument of a method call. Anyone who's tried to make a collision-detection system for a bunch of different types of polymorphic shapes will probably have encountered a similar situation there.

One way of working around the lack of multi-methods in C++ is to create some kind of registry that maintains a mapping from pairs of type_infos to the appropriate function to call.

Before looking at some of the implementation details, here's how we might use it:

// First, we register our merge functions with a combiner object:
action_combiner combine;

combine.add(merge_glyphs1);
combine.add(merge_glyphs2);


// Later on when we have two action_ptrs that need merging:

action_ptr merged = combine(a1, a2);

To implement this, we have action_combiner::add() look at the arguments of the function it is given and store a mapping from the types of those arguments to the function itself. So we first need a little wrapper around std::type_info that's copyable and less-than-comparable, in order to use it as a key in a map:

namespace impl
{
    class type_info2 // for want of a better name
    {
        public:
            // conversion constructor
            type_info2(const std::type_info &info) : 
                info(&info) 
            {
                // All std::type_info objects live in static storage, so 
                // holding a pointer to our argument is fine.
            }

            bool operator< (const type_info2 &rhs) const
            {
                // Could use std::less<const std::type_info *> here, 
                // but take care with dynamic libraries
                return info->before(*rhs.info);
            }

        private:
            const std::type_info *info;
    };

} // impl

Rather than having action_combiner::add() store the registered callbacks directly, we'll be wrapping them in objects derived from the following little interface. The reason for this will become apparent shortly:

namespace impl
{
    struct mergeop
    {
        virtual ~mergeop() { }
        virtual action_ptr merge(const action_ptr &a1, const action_ptr &a2) const = 0;
    };

    typedef boost::shared_ptr<mergeop> mergeop_ptr;

    // TODO: some derived implementations of mergeop ...

} // impl

Our action_combiner now looks like this:

class action_combiner
{
    public:
        typedef std::pair<impl::type_info2, impl::type_info2> type_pair;

        // Support functions that merge the second action in to the first.
        template<typename Target, typename Incoming>
        void add(void (*mergefunc)(Target &, const Incoming &))
        {
            const type_pair key(typeid(Target), typeid(Incoming));
            ops[key] = /* TODO: an impl::mergeop */ ;
        }
        
        // Support functions that take two actions and make an entirely new one.
        template<typename First, typename Second>
        void add(action_ptr (*mergefunc)(const First &, const Second &))
        {
            const type_pair key(typeid(First), typeid(Second));
            ops[key] = /* TODO: an impl::mergeop */ ;
        }

        // call operator does the combining.
        action_ptr operator() (const action_ptr &a1, const action_ptr &a2)
        {
            // TODO
        }

    private:
        std::map<type_pair, impl::mergeop_ptr> ops;
};

Now before I fill in those TODOs, I just want to mention that combiner has two overloaded add() methods.

The first overload registers a callback of the kind we've seen before — one that merges some incoming action into a target action. Again, the callback's arguments should be the full static types of the actions involved.

The second overload registers a callback that takes two derived action objects of types First and Second, and returns an action_ptr. In this case, the callback isn't expected to modify either of its arguments. This allows us to merge two actions of arbitrary derived type in to another action of some other derived type, possibly quite unrelated to the constituent actions from which it was created.

In each of the add() overloads, we've got a TODO to fill in. In each case, we'll be creating a derived implementation of the impl::mergeop interface that uses the specified callback in order to combine the supplied actions.

So, for the first overload we have merge_by_assimilation<Target, Incoming>:

namespace impl
{
    template<typename Target, typename Incoming>
    struct merge_by_assimilation : mergeop
    {
        typedef void (*callback_type)(Target &, const Incoming &);

        callback_type callback;

        merge_by_assimilation(callback_type callback) : callback(callback) { }

        virtual action_ptr merge(const action_ptr &a1, const action_ptr &a2) const
        {
            Target &target = dynamic_cast<Target &>(*a1);
            const Incoming &incoming = dynamic_cast<const Incoming &>(*a2);

            callback(target, incoming); // merge a2 in to a1
            return a1;
        }
    };

} // impl

And for the second overload we have hands_off_merge<First, Second>:

namespace impl
{
    template<typename First, typename Second>
    struct hands_off_merge :  mergeop
    {
        typedef action_ptr (*callback_type)(const First &, const Second &);

        callback_type callback;

        hands_off_merge(callback_type callback) : callback(callback) { }

        virtual action_ptr merge(const action_ptr &a1, const action_ptr &a2) const
        {
            const First &first = dynamic_cast<const First &>(*a1);
            const Second &second = dynamic_cast<const Second &>(*a2);

            return callback(first, second); // merge a1 and a2 in to a new action
        }
    };

} // impl

And finally, the action_combiner::operator(). You can probably guess how this is implemented by now, but one nice trick we can add is to return a multi_action if no appropriate mergeop has been created:

action_ptr action_combiner::operator() (const action_ptr &a1, const action_ptr &a2)
{
    const type_pair key(typeid(*a1), typeid(*a2));

    // Do we have a mergeop for the types of action involved?
    std::map<type_pair, impl::mergeop_ptr>::iterator f = ops.find(key);
    if (f != ops.end())
        return f->second->merge(a1, a2); // Yes!

    // No. We'll make a multi_action instead.
    boost::shared_ptr<multi_action> ma = boost::make_shared<multi_action>(a1);
    ma->add(a2);

    return ma;
}

Ok so let's quickly run through the pros and cons in comparison to the first approach. The biggest down side is that we have to register everything up front. If you're really sneaky, you might be able to do some kind of auto-registration system like a number of C++ unit-testing systems do with their tests and suites, but there are lots of subtleties to do with linkage involved there.

Is pre-registration of merge callbacks a lesser evil than having to keep all your action::assimilate() methods up to date as new actions are added? That's for you to decide.

As I promised at the end of the discussion of Solution 1, this 2nd solution provides a way to combine two actions in to an action that may be of an entirely different type to either of the constituent actions. But we've also kept open the option of being able to merge certain actions without having to allocate and fill a new one, in those cases where this makes sense.

In the 2nd solution, we have the action_ptrs to hand (as opposed to the actions) which allows us to do the multi_action-fallback stuff out-the-box, too. It can be bolted on to Solution 1 at the point where assimilate()s are called, but it won't be quite as smooth. Actually, we can take this multi_action-fallback trick even further. If no mergeop is found, we can look to see if a1 is already a multi_action and if so, append a2 to its subactions vector. Or if both a1 and a2 are multi_actions, we can instead append all of a2's subactions to those of a1.

Bonus solution: a combination

There's no reason of course that the two approaches couldn't be mixed. The action_combiner could try calling assimilate() if a mergeop can't be found. If assimilate() also can't help with the merge, only then should a multi_action be constructed.

Exception safety in multi_action

So we've now got a couple of ways of implementing a general and extensible action-merging system, with a little wiggle-room for taste. But whatever we do, it's likely that there will always be cases where we have to fall back to creating a multi_action.

Given that multi_action will be hanging around, I want to take a moment to examine a couple of points with respect to its exception safety. In the previous installment in the series, we looked into the possibility of adding two additional virtual functions to the action base class, specifically for exception safety:

class action
{
    public:
        // ...

        // NEW: do undo()/redo() provide the strong exception safety guarantee?
        virtual bool undo_provides_sg() const { return false; }
        virtual bool redo_provides_sg() const { return false; }

        // ...
};

If these functions are present, they provide the undo_stack class with the ability to decide whether or not it should clear its content when an undo() or redo() operation throws an exception. For the entire scoop, you'll have to read the previous article, but the basic reasoning for this is as follows.

If an undo() call throws an exception, it might leave the document in some intermediate state (in the general case). Even though such a state might be well defined, there's no guarantee that said state is the one expected by the previous action's redo() method. So the only safe thing to do at this point is to clear the undo_stack.

However, if we know that the action's undo() method provides the strong guarantee, we can be sure that the failed undo() operation did not affect the document and we can therefore leave the content of the undo_stack in place.

Ok, recap over. Let's now think about this in the context of a multi_action. Specifically, can we implement multi_action::undo_provides_sg() and multi_action::redo_provides_sg() in a useful way? Unfortunately the answer is that we can't.

To see why, suppose we have two functions, f() and g() that each provide the strong exception safety guarantee. If we compose a function h() that calls f() followed by g(), what can we say about the exception safety properties of h()?

void f(); // strongly exception safe
void g(); // strongly exception safe

void h() // what can we say about this?
{
    f();
    g();
}

If the call to f() completes without throwing but g() subsequently throws, then h() propagates the exception having performed only part of its job. So h() provides only the basic guarantee.

Now substitute f() with action1->redo() and g() with action2->redo() and h() is then equivalent (in terms of side-effects) to executing the redo() method of a multi_action that contains action1 and action2. Thus, a multi_action, in the general case can only provide the basic exception safety guarantee.

If we care enough about this, we could replace undo_provides_sg() and redo_provides_sg() with functions that return a constant indicating the degree of exception safety offered. For example:

enum exception_safety
{
    exceptions_are_fatal,
    basic_guarantee,
    strong_guarantee,
    no_throw_guarantee
};

class action
{
    public:
        // ...

        // NEW: in place of undo_provides_sg()/redo_provides_sg()
        virtual exception_safety undo_guarantee() const { return basic_guarantee; }
        virtual exception_safety redo_guarantee() const { return basic_guarantee; }

        // ...
};

But what does this buy us? If we have a multi_action that contains actions whose redo() methods all provide the strong guarantee, and all but the last action has an undo() method that provides the no-throw guarantee, multi_action::redo() is able to provide the strong guarantee:

void multi_action::redo()
{
    for (std::size_t a = 0, n = subactions.size(); a != n; ++a)
    {
        try { subactions[a]->redo(); }
        catch (...)
        {
            if (redo_guarantee() == strong_guarantee)
            {
                // Roll back before propagating the exception.
                // We can do this because subactions[a]->redo() is known to 
                // provide the strong guarantee and all undo()s of previous
                // actions are no-throw.
                while (a--)
                    subactions[a]->undo();
            }
            throw;
        }       
    }
}

Here's how multi_action::redo_guarantee() would be implemented:

exception_safety multi_action::redo_guarantee()
{
    assert(!subactions.empty());

    exception_safety ug = no_throw_guarantee; // best undo guarantee
    exception_safety rg = no_throw_guarantee; // best redo guarantee

    // loop over all subactions, except the last
    for (std::size_t a = 0, n = subactions.size(); a + 1 < n; ++a)
    {
        ug = std::min(subactions[a]->undo_guarantee(), ug);
        rg = std::min(subactions[a]->redo_guarantee(), rg);
    }

    // factor the guarantee of the final subaction in to rg but not ug.
    rg = std::min(rg, subactions.back()->redo_guarantee());

    if (rg == strong_guarantee && ug == no_throw_guarantee)
        return strong_guarantee;

    return rg;
}

The same kind of thing can be done for multi_action::undo() too, of course. Now, even though this is technically possible, it's quite fiddly and personally I'm not sure it's worth the effort. But the option's there if you really care about holding on to as much undo data as possible in the presence of error conditions[1].

Next time: lifetime management

In the next installment (which will hopefully be a bit shorter!), we'll look at some of the issues surrounding lifetime management of the objects in the data model in the face of an undo/redo system.

Footnotes
  1. Though if you really care that much, you might be better off building your data model on top of a database that provides support for transactions []

Comments

Edward

[24/10/2013 at 16:02:43]

Thanks for writing these posts, I've always thought it strange that this topic seems to be so poorly and infrequently discussed online, considering how long GUI applications have supported it and how it is regarded as a basic feature that 'just ought to be there' yet has so many implications for the implementation of any piece of 'editor' based software. Presumably the big companies like MS, Adobe etc. have sufficient 'institutional knowledge' and experience that newcomers just pick these things up by reviewing the established code bases.

I've puzzled over this for years, as the suggestion to 'just use the Command pattern' (LOL noob etc.) never seemed sufficient compared to the potential complexities of different types of model manipulations that might be required. If a feature of your software is to sort a list of items, how could you undo that sort, except by storing the exact previous order of the items somewhere. Your 'undo' method implementations would easily end up being more complicated than your 'execute' implementation, and you effectively have to write every function twice, once forwards and once backwards.

In the end it seemed that the Clojure style persistent data structure technique offered the best compromise. The 'execute' path is perhaps more complex, as it has to be written in a functional style with immutable state (using something like lenses), but the 'undo' path is essentially free. Regardless of the complexity of the state manipulation involved, it's always a matter of returning to the previous 'root node' of the data model.
Using 'structural sharing' you should be able minimise the amount of data duplication involved in maintaining several different model states in memory.

Actually I dispensed with bulk of the Command pattern entirely, and just labelled each state transition with the text to be used on the undo/redo menu items. You don't need to worry any more about exception safety as there are no state manipulations going in in 'undo' and 'redo', just switching too an existing alternate 'root node', and you also don't need to think about combining small operations into larger ones, as each operation is only and always a transition from one 'root node' to another, regardless of the complexity of changes made to the state. There's no need to ever 'unsort' something, just return to the state that existed before the sorting occurred.

All in all it seems to work pretty well, of course we haven't actually shipped it yet..

(optional)
(optional)
(required, hint)

Links can be added like [this one -> http://www.mr-edd.co.uk], to my homepage.
Phrases and blocks of code can be enclosed in {{{triple braces}}}.
Any HTML markup will be escaped.