mr-edd.co.uk :: horsing around with the C++ programming language

New year's resolution: learn x86 assembly

[20th January 2010]

One reason to use C++ — perhaps the only reason — is performance. I'd certainly rather be using something else if I wasn't writing… um… attempting to write high performance code.

It is evident that C++ authors often feel the need to bend over backwards to write code that they hope will perform well. We aim to avoid copying data about as much as possible. We aim to avoid dynamic allocation. We inline small functions and outline big ones.

If we're going through the trouble of using a language with as many quirks and corner cases as C++ for the sake of performance, we might as well try to get the most out of it, right?

When examples of convoluted hand-optimization land on public forums, cries such as "let the compiler do that" are often heard. They come from my mouth, sometimes.

But do I and others like me really know what we're talking about? Does the compiler really know best all the time? I know that return value optimization can be safely applied to my class, but does the compiler? Can it really deduce that it's OK to replace std::copy with a call to memcpy in this situation? How about removing those virtual function calls? And vectorizing this loop?
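To make the std::copy question concrete, here is a minimal sketch of the two spellings side by side. The `sample` type and both function names are made up for illustration; whether a given compiler and standard library lower the first to the same code as the second is exactly the thing that would have to be checked in the generated assembly.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical trivially copyable element type, for illustration only.
struct sample { float value; int id; };

// Generic spelling: the library or compiler may turn this into a
// memcpy/memmove-style block copy, but nothing here guarantees it.
void copy_generic(const sample* src, std::size_t n, sample* dst) {
    std::copy(src, src + n, dst);
}

// Hand-optimized spelling: only valid because sample is trivially
// copyable and the two ranges do not overlap.
void copy_raw(const sample* src, std::size_t n, sample* dst) {
    std::memcpy(dst, src, n * sizeof(sample));
}
```

Both functions produce the same result for non-overlapping ranges; the open question is only whether they produce the same instructions.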

Is this

u += v; // 3D vectors, perhaps
u *= 0.5;
foo(u);

really quicker than this[1]?

foo((u + v)/2);

I think not. I certainly hope not. I want to believe in the compiler! But I don't know if I can.
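One way to stop guessing is to put both versions in a file and read what comes out. The following is only a sketch: `vec3`, its operators, `foo`, `in_place` and `expression` are all hypothetical stand-ins, not the types from the library mentioned in the footnote.

```cpp
// Hypothetical 3D vector type, for illustration only.
struct vec3 { double x, y, z; };

inline vec3& operator+=(vec3& a, const vec3& b) {
    a.x += b.x; a.y += b.y; a.z += b.z;
    return a;
}

inline vec3& operator*=(vec3& a, double s) {
    a.x *= s; a.y *= s; a.z *= s;
    return a;
}

// Takes the left operand by value so the copy can be reused as the result.
inline vec3 operator+(vec3 a, const vec3& b) { a += b; return a; }

inline vec3 operator/(vec3 a, double s) {
    a.x /= s; a.y /= s; a.z /= s;
    return a;
}

// Stand-in for foo(); in a real comparison it would live in another
// translation unit so that it cannot be inlined away.
double foo(const vec3& v) { return v.x + v.y + v.z; }

// The "hand-optimized" spelling: compound assignment only.
double in_place(vec3 u, const vec3& v) {
    u += v;
    u *= 0.5;
    return foo(u);
}

// The readable spelling: a temporary built from an expression.
double expression(const vec3& u, const vec3& v) {
    return foo((u + v) / 2);
}
```

Compiling this with something like `g++ -O2 -S` writes the assembly to a `.s` file, where the bodies of `in_place` and `expression` can be compared directly. The sketch makes no claim about which one wins.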

It's quite rare now that I write C++ code that doesn't use threads in some way. But since C++ doesn't yet have a thread-aware memory model, how can I be sure that fences are being put in the right places? I can't unless I get down and dirty with the instructions.

And take a look at these posts and articles:

If, like me, you found them relevant and interesting but only understood around 70% of their content, perhaps you already know where I'm coming from.

The only way to check stuff like this is to look. And if that means reading the generated code, then so be it. Maybe then I can finally feel like I'm doing the language justice.

Footnotes
  1. This is a real example from a library used at work, where the author deliberately omitted functionality such as operator+ in the belief that forcing users to use operator+= instead would produce faster code.

Comments

Brendan Miller

[21/01/2010 at 09:16:22]

Yeahh, I've been feeling the same way recently. I'm not sure this kind of micro optimization stuff is that important from a performance perspective... but I honestly think that low level programming is just fun, and provides a lot of interesting puzzles.

Sometimes it's just fun to see how language constructs are implemented. Just run gcc or g++ with the -save-temps flag.

Brian

[21/01/2010 at 23:44:54]

Since I have learned x86, I like taking a look at how the compiler does it. While I am far from an optimisation expert, you can usually see if the compiler has done a decent job. I wouldn't be able to do advanced stuff like vectorisation without learning quite a bit more.

Edd

[22/01/2010 at 09:39:58]

Brian: that's exactly the kind of knowledge I'm aiming for. I don't anticipate writing a lot of assembly, if any at all, but it would be nice to know if my code is as transparent to the compiler as I would like it to be.

Mateusz Loskot

[09/02/2010 at 11:24:44]

A couple of months ago I started learning assembly to be able to read and understand generated code. I read Duntemann's book (www.duntemann.com) as well as Windows-specific books about debugging, like John Robbins'. I'm not an expert in assembly and I doubt I could write software in it, but I agree with you guys. Knowing how things work helps when writing C or C++.
