The two key benefits that I constantly hear lauded about entity systems are 1) the easy construction of new kinds of entities due to not having to tangle with complex inheritance hierarchies, and 2) cache efficiency.
Note that (1) is a benefit of component-based design, not just ES/ECS. You can use components in many ways that do not have the "systems" part and they work just fine (and plenty of both indie and AAA games use such architectures).
Systems need to be able to work with more than one component; i.e., both the rendering and physics systems need to access the transform component.
Most ECS sort their component containers by Entity ID, meaning that the corresponding components in each group will be in the same order.
This means that if you're linearly iterating over the graphics components, you're also linearly iterating over the corresponding transform components. You might skip some of the transforms (since you may have physics trigger volumes that you don't render, for example), but since you're always skipping forward in memory (and usually not by particularly large distances) you're still going to see efficiency gains.
This is similar to why Structure of Arrays (SoA) is the recommended approach in HPC. The CPU and cache can deal with multiple linear arrays almost as well as with a single linear array, and far better than with random memory access.
You can have components store pointers to other components, or pointers to entities that store pointers to components.
Somewhat unrelated, but... don't do that. Use handles, not pointers.
You can ensure that every component array is 'n' large, where 'n' is the number of entities alive in the system.
Since you use handles and not pointers, you can resize the arrays without fear of breaking object references.
You can also use a "chunked array" approach (an array of arrays), similar to many common std::deque implementations (though without the pitifully small chunk size of those implementations), if you want stable pointers for some reason or if you have measured problems with array-resize performance.
Secondly, this is all assuming that entities are processed linearly in a list every frame/tick, but in reality this is often not the case.
It depends on the entity. Yes, for many use cases, it's not true. Indeed, this is why I so strongly stress the difference between component-based design (good) and entity-system (a specific form of CBD).
Some of your components will certainly be easy to process linearly. Even in normally "tree heavy" use cases we have definitely seen performance increases from using tightly packed arrays (mostly in cases involving an N of a few hundred at most, like AI agents in a typical game).
Say you use a sector/portal renderer or an octree to perform occlusion culling. You might be able to store entities contiguously within a sector/node, but you're going to be jumping around whether you like it or not.
You'd be surprised how much the array still helps. You're jumping around in a much smaller region of memory than "anywhere" and even with all the jumping you're still much more likely to end up in something in cache. With a tree of a certain size or less, you might even be able to prefetch the whole thing into cache and not ever have a cache miss on that tree.
There are also tree structures that are built to live in tightly packed arrays. For instance, with your octree, you can use a heap-like structure (parents before children, siblings next to each other) and ensure that even when you "drill down" the tree you're always iterating forward in the array, which helps the CPU optimize the memory accesses / cache lookups.
Which is an important point to make. An x86 CPU is a complex beast: it runs an optimizer at runtime on your machine code, breaking instructions into smaller micro-ops, reordering them, predicting memory access patterns, and so on. Data access patterns matter more than may be readily apparent if all you have is a high-level understanding of how the CPU or cache works.
Then you have other systems, which might prefer entities stored in some other order.
You could store them multiple times. Once you strip down your arrays to the bare minimum details, you might find you actually save memory (since you've removed your 64-bit pointers and other nonsense and can use small 16-bit indices or so on) with this approach.
You could interleave your entity array instead of keeping separate arrays, but you're still wasting memory.
This is antithetical to good cache usage. If all you care about are the transforms and graphics data, why make the machine spend time pulling in all that other crap for physics and AI and input and debug and so on?
That's the point usually made in favor of ECS vs monolithic game objects (though not really applicable when comparing to other component-based architectures).
AI is pointless if it can't affect the transform or animation state used for an entity's rendering.
Just because AI can't efficiently access transform data linearly doesn't mean that no other system can use that data layout optimization effectively. You can use a packed array for transform data without stopping game logic systems from doing things the ad hoc way game logic systems usually do things.
You're also forgetting the code cache. When you use the systems approach of ECS (unlike some more naive component architectures), you're guaranteeing that you're running the same small loop of code rather than jumping back and forth through virtual function tables to an assortment of random Update functions strewn all over your binary. So in the AI case, you really want to keep all your different AI components (and certainly you have more than one, so that you can compose behaviors!) in separate buckets and process each list separately in order to get the best code cache usage.
With a delayed event queue (where a system generates a list of events but doesn't dispatch them until it finishes processing all entities), you can ensure that your code cache is used well while still supporting events.
Using an approach where each system knows which event queues to read out of for the frame, you can even make reading events fast. Or faster than without, at least.
Remember, performance isn't absolute. You don't need to eliminate every last single cache miss to start seeing the performance benefits of good data-oriented design.
Or perhaps there's a hybrid approach that everyone's using but nobody's talking about.
This is what I advocate: use good data-oriented approaches to components where performance is critical. Use simpler architecture where simplicity improves development time. Don't shoe-horn every single component into a strict over-definition of componentization like ECS proposes. Develop your component architecture in such a way that you can easily use ECS-like approaches where they make sense and simpler component structures where ECS-like approaches don't make sense (or make less sense than a tree structure, and so on).
You might note that a lot of people talk about ECS without really understanding what it even is. I see Unity referred to as an ECS with depressing frequency, illustrating that too many game developers equate "ECS" with "Components" and pretty much ignore the "Entity System" part entirely. You see a lot of love heaped on ECS on the Internet when a large portion of the people are really just advocating component-based design, not an actual ECS. At this point it's almost pointless to argue it; ECS has been corrupted from its original meaning into a generic term and you might as well accept that "ECS" doesn't mean the same thing as "data-oriented ECS". :/