New year's resolution: learn x86 assembly
[20th January 2010]
One reason to use C++ — perhaps the only reason — is performance. I'd certainly rather be using something else if I wasn't writing… um… attempting to write high performance code.
It is evident that C++ authors often feel the need to bend over backwards to write code that they hope will perform well. We aim to avoid copying data around wherever possible. We aim to avoid dynamic allocation. We inline small functions and outline big ones.
If we're going through the trouble of using a language with as many quirks and corner cases as C++ for the sake of performance, we might as well try to get the most out of it, right?
When examples of convoluted hand-optimization land on public forums, cries such as "let the compiler do that" are often heard. They come from my mouth, sometimes.
But do I and others like me really know what we're talking about? Does the compiler really know best all the time? I know that return-value-optimization can be safely applied to my class, but does the compiler? Can it really deduce that it's ok to replace std::copy with a call to memcpy in this situation? How about removing those virtual function calls? And vectorizing this loop?
Is this
u += v; // 3D vectors, perhaps
u *= 0.5;
foo(u);
really quicker than this?
foo((u + v)/2);
I think not. I certainly hope not. I want to believe in the compiler! But I don't know if I can.
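To make the comparison concrete, here is a minimal sketch of the two forms wrapped in functions that can be compiled and inspected side by side. The names Vec3, foo, in_place and with_temporary are made up for illustration; they are not taken from the library mentioned in the footnote, and the real vector type may well look different.

```cpp
// Hypothetical 3D vector type, for illustration only.
struct Vec3 {
    double x, y, z;

    Vec3& operator+=(const Vec3& o) { x += o.x; y += o.y; z += o.z; return *this; }
    Vec3& operator*=(double s) { x *= s; y *= s; z *= s; return *this; }
};

// Non-member operators built on the compound assignments above.
inline Vec3 operator+(Vec3 a, const Vec3& b) { a += b; return a; }
inline Vec3 operator/(Vec3 a, double s) { a.x /= s; a.y /= s; a.z /= s; return a; }

// Stand-in consumer (an assumption); it just sums the components.
double foo(const Vec3& v) { return v.x + v.y + v.z; }

// The "hand-optimized" form: mutate a copy of u in place.
double in_place(Vec3 u, const Vec3& v) {
    u += v;
    u *= 0.5;
    return foo(u);
}

// The "natural" form: build the result from temporaries.
double with_temporary(const Vec3& u, const Vec3& v) {
    return foo((u + v) / 2);
}
```

Compiling this with something like g++ -O2 -S and comparing the code generated for in_place and with_temporary is one way to see whether the temporaries cost anything. Because foo is defined in the same file, the compiler is free to inline it; moving its definition to another translation unit would give a picture closer to a real call.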
It's quite rare now that I write C++ code that doesn't use threads in some way. But since C++ doesn't yet have a thread-aware memory model, how can I be sure that fences are being put in the right places? I can't unless I get down and dirty with the instructions.
And take a look at these posts and articles:
- Timing square root
- Source code optimization (PDF) by Felix von Leitner
- The Ubiquitous SSE vector class: Debunking a common myth
If, like me, you found them relevant and interesting but only understood around 70% of their content, perhaps you already know where I'm coming from.
The only way to check stuff like this is to look. And if that means reading the generated code, then so be it. Maybe then I can finally feel like I'm doing the language justice.
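As a starting point for that kind of looking, here is a small sketch for the std::copy versus memcpy question raised earlier. The Sample type and both function names are hypothetical, chosen only to give the compiler a plain-data element type to work with.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Hypothetical plain-data element type, for illustration only.
struct Sample {
    int id;
    float value;
};

// Copy n elements using the generic algorithm.
void copy_with_std_copy(const Sample* src, Sample* dst, std::size_t n) {
    std::copy(src, src + n, dst);
}

// Copy the same n elements as raw bytes.
void copy_with_memcpy(const Sample* src, Sample* dst, std::size_t n) {
    std::memcpy(dst, src, n * sizeof(Sample));
}
```

Running g++ -O2 -S copy.cpp -o copy.s (assuming the file is saved as copy.cpp) writes the generated assembly to copy.s, where the two function bodies can be compared. What shows up there depends on the compiler, its version and the flags used, so one listing settles the question only for that particular setup.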
- This is a real example from a library used at work, where the author deliberately omitted functionality such as operator+ because of the belief that forcing users to use operator+= instead would produce faster code.
All original content copyright © Edd Dawson.
Any opinions expressed by Edd are his own and are not necessarily shared by his employer. Or by anyone else, in fact.
All source code appearing on this website that was written by Edd Dawson is made available under the terms of the Boost software license version 1.0 unless otherwise stated or implied by the license associated with the work from which the code is derived.