When I started in the games industry back in 2000, one of my first major tasks (with the help of one other eager masochist) was to rewrite that company’s graphics engine for the PS2 (while a game was being written using it – a recipe for pain if I’ve ever seen one). I designed and built it using the knowledge I’d attained from working in the Mining and Defence industries as well as what I’d learnt in academia. It was basically an object oriented engine – you had renderable objects that contained a lot of information about themselves: their state, size, orientation and position, references to vertex and texture data, etc. These renderable objects were stored in a fairly flat hierarchy, and DMA chains were constructed from the visible objects and used for rendering on the PS2. It was a simple engine (I don’t believe in overcomplicating things) and it did its job well enough, but over the years it grew in functionality, and so did the performance demanded of it.
The obvious bottlenecks were optimised and eventually what we were left with was an engine that, while functional enough, just didn’t run fast enough. Profiling at this point showed no tightly contained bottlenecks; it was as if a miasma of inefficiency was spread throughout the rendering system – everything was just a little bit slow. I bit the bullet and completely rewrote it, pulling out the object oriented sensibilities and replacing everything with flat homogeneous arrays – the static world was still rendered in parcels, but each parcel was a lean set of arrays of bounding boxes, DMA chains and data already prepared for sending directly to the hardware. With data neatly laid out in such a fashion, performance leapt by an order of magnitude – a test scene I was profiling with went from taking 17ms to render to 2ms. Loading of the data from disc also sped up dramatically. Note that this was an engine that had already been optimised and improved over 5 years.
I had to learn a lot to extract this level of performance – I had to understand how the I and D caches worked, how the compiler transformed code, and how the data flowed through the hardware and software I was using and writing. I also learnt that in order to gain a high level of performance you will probably have to throw away your OO design and replace it with a design that considers data and the flow of that data as a primary concern.
This is even more evident in today’s machines – it can cost up to 600 cycles to extract a piece of data from outside of the L2 cache on a PowerPC processor! Do you have any idea how much processing you can do in 600 cycles? In order to extract a high level of performance, a programmer *must* consider the data over the processing of that data. If your data is not in cache-friendly, coherent streams then it doesn’t matter how few cycles your code takes to execute; all that matters is how fast you can get your data to your instructions. Prefetching your data helps, but you still have to be able to look 400 cycles or so ahead to ensure that the required data is ready in the cache when you need it.
This isn’t a new problem, but it is one that has been slowly creeping up on us. In the 80s we had the pleasure of access to main memory being in the order of a single cycle or so – obviously the focus of design in such a system is on the instructions. Do you know what was written in the 80s? C++ (well, started in ’79 but first released in ’85). Since the 80s, CPU speeds have been increasing by roughly 60% per year, while memory performance has crawled along at a measly 10% or so per year.
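To put rough numbers on that divergence, here is a back-of-the-envelope sketch that takes the 60% and 10% annual figures above at face value (they are approximations, and real hardware did not improve this smoothly). If both start level, the ratio of CPU speed to memory speed after $n$ years is:

$$\frac{\text{CPU}(n)}{\text{memory}(n)} = \left(\frac{1.60}{1.10}\right)^{n} \approx 1.45^{\,n}$$

In other words the gap widens by about 45% every year, which compounds to roughly 40-fold after a decade and somewhere around 1,800-fold after two.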
What this means is that this problem will only get worse. Adding extra levels of cache will help, better and bigger caches will help, but in the end you still need to get your data from the relatively slow main memory into your pipeline. And if you want your system to perform well, you will need to think very carefully about where the data that you want is, how much there is of it and how long it will take to get it.
The reason that OO design is so bad for modern (console) architectures is that it treats data and code as being equally important. Bundling up all the associated data into a single contiguous chunk may be convenient for debugging and for your traditional OO programming mindset, but it will run badly. You are far better off allocating this data into homogeneous pools (avoiding heavy malloc() calls is always a good idea anyway) or at least keeping the data that is used together contiguous (spatial and temporal locality of data is a necessary goal here).
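As a rough illustration of the difference, here is a minimal C++ sketch. None of it comes from the engine described above: `RenderableObject`, `RenderablePool` and `cullBehindPlane` are hypothetical names, and the single-plane test is a deliberately simplified stand-in for real visibility culling. The point is only that the culling loop reads two tightly packed arrays and never touches the texture or vertex data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical OO-style layout, shown for contrast only: every field of an
// object sits together, so a culling pass drags orientation, texture and
// state data through the cache even though it reads just centre and radius.
struct RenderableObject {
    float centre[3];
    float radius;
    float orientation[4];
    int   textureId;
    int   vertexBufferId;
    int   stateFlags;
};

// Hypothetical data-oriented layout: one homogeneous array per field,
// all indexed by the same object index.
struct RenderablePool {
    std::vector<float> centreX, centreY, centreZ, radius; // read while culling
    std::vector<int>   textureId, vertexBufferId;         // read only when drawing

    // Appends one object and returns its index in every array.
    std::size_t add(float x, float y, float z, float r, int texture, int vertexBuffer) {
        centreX.push_back(x);
        centreY.push_back(y);
        centreZ.push_back(z);
        radius.push_back(r);
        textureId.push_back(texture);
        vertexBufferId.push_back(vertexBuffer);
        return radius.size() - 1;
    }
};

// Returns the indices of objects whose bounding sphere reaches the plane
// z = planeZ or lies beyond it. Only centreZ and radius are read.
std::vector<std::size_t> cullBehindPlane(const RenderablePool& pool, float planeZ) {
    assert(pool.centreZ.size() == pool.radius.size());
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < pool.radius.size(); ++i) {
        if (pool.centreZ[i] + pool.radius[i] >= planeZ) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

With the OO layout the same loop would step over a whole `RenderableObject` per iteration, pulling in far more cache lines for the same amount of useful data.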
The other benefit of considering data in this manner is that it becomes much easier to parallelise. Your code is generally simpler (it is doing fewer things at once), dependencies are more obvious and functionality more delineated (making it easier to break up into independent tasks). You also know what you will be doing 400 or 500 cycles in the future, so prefetching becomes easier too. Not to mention the ease of migrating this code to SPUs (assuming you have them).
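The prefetching point can be sketched like this. It is only an illustration under stated assumptions: `prefetchRead`, `kPrefetchDistance` and `sumVisibleRadii` are hypothetical names, the look-ahead of 8 elements is an arbitrary placeholder that would need tuning on real hardware, and `__builtin_prefetch` is the GCC/Clang intrinsic (other compilers fall through to a no-op here). Because the list of indices to process is known up front, the loop can ask for the data it will need a few iterations from now.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Thin wrapper around the GCC/Clang prefetch intrinsic; a no-op elsewhere.
inline void prefetchRead(const void* address) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(address);
#else
    (void)address;
#endif
}

// Hypothetical look-ahead distance in elements; tune per platform.
constexpr std::size_t kPrefetchDistance = 8;

// Sums the radii of the objects listed in `visible`.
// Precondition: every entry of `visible` is a valid index into `radius`.
float sumVisibleRadii(const std::vector<float>& radius,
                      const std::vector<std::size_t>& visible) {
    float total = 0.0f;
    for (std::size_t i = 0; i < visible.size(); ++i) {
        // The index list tells us exactly what we will touch later,
        // so request that element now and let it arrive while we work.
        if (i + kPrefetchDistance < visible.size()) {
            prefetchRead(radius.data() + visible[i + kPrefetchDistance]);
        }
        assert(visible[i] < radius.size());
        total += radius[visible[i]];
    }
    return total;
}
```

A prefetch is only a hint, so the result is the same with or without it; whether it actually helps depends on the platform and has to be measured.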
There is still a place for object oriented design in games, most definitely. C++ provides some very convenient ways to manage large systems of code, and 80% of your codebase isn’t going to be the bottleneck anyway. It’s the 20% that gets executed 80% of the time that you need to worry about. If you aren’t clear on how data will flow through your system, or can’t know how it flows, then by all means build that system in an OO fashion, but be aware that you may (will) have to rewrite this code at a later date. Keep an eye on the data in the classes, be aware of how this data is used, note which data is used the most – and when that system becomes a bottleneck, refactor it so that it works efficiently under the hood. If your design is adequate then you should be able to maintain a similar interface and protect the rest of the game code from too much disruption. But, in order to make things easier on yourself, you should be considering the design of your data over the design of your code and you should be doing it now.
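One way to keep a similar interface while changing what sits under the hood might look like the sketch below. The `EnemyManager` class, its `Handle` type and its one-dimensional movement are all made up for illustration. Game code keeps calling `spawn`, `setVelocity` and `update`, while the storage behind those calls is a set of flat arrays rather than one heap-allocated object per enemy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical facade: callers see handles and member functions, and never
// learn whether the data behind them lives in objects or in flat arrays.
class EnemyManager {
public:
    using Handle = std::size_t;

    // Creates an enemy at rest and returns a handle to it.
    Handle spawn(float x, float health) {
        positionX_.push_back(x);
        velocityX_.push_back(0.0f);
        health_.push_back(health);
        return positionX_.size() - 1;
    }

    void setVelocity(Handle h, float vx) {
        assert(h < velocityX_.size());
        velocityX_[h] = vx;
    }

    float position(Handle h) const {
        assert(h < positionX_.size());
        return positionX_[h];
    }

    float health(Handle h) const {
        assert(h < health_.size());
        return health_[h];
    }

    // Hot path: streams through positions and velocities only,
    // leaving the health array out of the cache entirely.
    void update(float dt) {
        for (std::size_t i = 0; i < positionX_.size(); ++i) {
            positionX_[i] += velocityX_[i] * dt;
        }
    }

private:
    std::vector<float> positionX_;
    std::vector<float> velocityX_;
    std::vector<float> health_;
};
```

Handing out indices instead of pointers is the design choice that matters here: it leaves the manager free to rearrange its storage later without breaking callers. A real system would also need removal and handle validation, which this sketch leaves out.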
Some of the game development industry’s top programmers have been talking (and in at least one case, ranting) about this for years. Christer Ericson talked about it in his GDC 2003 presentation. Mike Acton persistently proclaims that C++ programming is Bullshit and his Three Big Lies are fundamentally about designing around data instead of code. Recently, Noel Llopis published an excellent article on Data-Oriented Design in the September issue of Game Developer magazine.
Memory access speeds have been the elephant in the room for years now, but now either the elephant is getting bigger or the room is getting smaller. Either way, we can’t afford to ignore it anymore.




