Timothy Lochner's Blog

Saturday, August 18, 2018

Mana Engine: Achieving thread safety

I've recently been talking more about the engine I've spent a few years developing. It's been through a lot of revisions, entirely dumping the core of how it works a time or two.

It's most recent iteration has been developed by myself and Rob Brink (@rob_brink on twitter). We both have a bit of experience developing game engines at this point and wanted to really push a core-neutral, thread-safe, easy to use engine. As Rob has often said (in some variation or another) "It needs to be easier to use than falling on your face."

There is a great talk in the GDC vault by the tech director of Overwatch, where he goes over the ECS of their engine. At some point he mentioned that some systems can run in parallel, because you can statically know that they won't ever touch the same component data.

We wanted to take this to the next level by letting the engine handle all of that for you.

What we came up with (and Rob did most of the implementation of) was a way to generate a graph of dependent tasks. We would statically enforce access to components by type, so that we could guarantee that the engine knew when a component type was being read from or written to. Using this knowledge, it could then build the graph.

A system ends up looking something like this:

class MySystem : public AuthorizedSystem< const MyComponent1, const MyComponent2, MyComponent3 >
{
    void Update()
    {
        //operate on the data.
    }
}

There's a bit more to it than that. For instance, accessing components requires some special syntax so that the engine can know whether this system is allowed to use the data or not. But ignoring those details for a moment, lets see how this helps us write thread-safe code.

AuthorizedSystem is a special type of system that is templated in a similar way to a tuple. Using some typetraits similar to std::is_const and some other static helpers, we can identify which ones are allows to be written to or read from. The scheduler knows that when two systems don't access the same data, or only read from that data, they can be run in parallel. It also knows that any writers need to go before any readers. There ends up being some possibilities of circular dependencies, and I'll discuss more about resolving those later.

Currently, in the test game we're making to prove out the engine, this is what our dependency graph looks like.

Side bar: If you're not familiar with webgraphviz.com, it's a very worthwhile and easy to use tool.

In reality, this is a pruned version of our dependency graph. We have some other systems that I didn't include in this run because I'm not testing them. Rob wrote a nice Boids system that I didn't want to use in this run because I was testing other things.

And here's a delicious bonus to all this. Want to know what it took to remove the 5 Boids systems? In my main initialization function, I just had to comment out this line:

Boid::RegisterSystems(world);

We also have some auto generated systems that helps do some bookkeeping around components that I've omitted from the graph for the sake of simplicity.

Okay, so this looks neat and all, but why does it matter?

One thing I've noticed over the years as a developer is that some programmers tend to have an affinity for the harder problems, often within a specific domain. We have graphics programmers, engine programmers, gameplay programmers, networking programmers, audio programmers. The list of specializations goes on. Often enough, especially now when multi-threaded programming has yet not been a must-have understanding to write game code, programmers of all sorts of specialties have not needed to know how to deal with multi-threaded problems.

And in the Mana Engine, they don't have to. The engine is built to help you deal with multi-threaded problems and whenever possible, statically assert that you'll never run into race conditions or performance problems because of things like false sharing.

Recently, Monster Hunter World on PC was revealed (unconfirmed by the developer) to spend a quarter of it's processing time just managing thread overhead, and had around 100 or so of them. Lock-free and wait-free programming isn't easy. And even if you've solved it before that doesn't mean you won't mess up if you need to do it again.

Riot's recent engineering blogpost about performance shows a graph where the main thread is waiting on the particle thread before it can continue. It'd be nice if, in cases like this, instead of waiting the main thread could assist on some of the work the particle thread is doing. But then you get into other issues where now particle work is being done on the main thread and locking might get screwed up. Rob and I have both seen our fair share of bugs and deadlocks caused by the pattern of letting threads "assist" with another thread's work.

In the Mana Engine, we don't even have a main thread. We've chosen to solve the problem before it even happens.

In the Mana Engine, there are always the same or fewer threads than the computer's logical cores and the overhead is absolutely minimal. Solved that problem before it even happened.

In future posts, I hope to cover some more details about the Mana Engine, some of the problems we've ran into, and how we've solved them.

In the mean time, if you have any interest in the Mana Engine and it's details, message me on twitter and I'll be happy to share what we've learned.

Wednesday, December 28, 2016

The Data You Don't Care About

If you haven’t read this post, you probably should before continuing. It kinda sets up everything.

In the last post, it was all about the Data You Care About and making sure that your methods are only bothering with such data. This post is about the other data – Data You Don’t Care About. It begs the question – if you don’t care about it, why am I even writing about it?

Data You Don’t Care About is all about bookkeeping. It facilitates doing things to the data you do care about. As in the previous post, the m_Count and m_Space data members of the example array class is this type of data. I really want to push a point here – it’s data that help you do things to the data you care about. As a result, this data is intrinsically tied to functions.

Enter, the Doer classes.

Okay, okay. So there’s a bad stigma around Doer classes. They have long names, they often wrap only a single function, and can actually make your program harder to reason about. Yes, I agree, if done poorly, Doer Classes are all those terrible things. In fact, the very video that started me on these blog posts explicitly says they’re ugly and bad. In the context he’s talking about they are bad – I’ve seen that kind of code first-hand. What I’m proposing is a way to go about it that really… isn’t bad. Because really what we’re doing is associating bookkeeping data with the functions that do the bookkeeping. Classes just provide us a way of controlling access to that bookkeeping.

In the last post, I gave some examples of game systems that might want to respond to changes in data. These systems are doers. They can very often start as just a single function, but when you need to introduce bookkeeping (often for the sake of optimization, but sometimes other reasons) keeping the bookkeeping data with your function is quite useful.

Let me just throw out an example to make it a bit easier to grasp. Let’s say you have a system for rendering 3D objects in your game. You could loop through your Model components and let each one be a draw call, but there’s a better way to do it. You could organize them by mesh, so that you render all of one type of mesh all at once – probably using hardware instancing. Doing this organization will cost us a bit on the CPU, but will save us far more on our draw call count and the GPU.

// The below code depends on the understanding, so here's a quick low-down of some of the objects and methods in the example:
// Declared elsewhere:
// Model: A Component that has data related to a model (ie, mesh, material, etc).
//     ModelId Model::GetModelId(): returns the ModelId.
// Transform: A Component that has data related to position, orientation, and scale.
//     const Matrix44& Transform::GetWorldMatrix(): returns a Matrix44 that represents the transform's world matrix.
// ComponentId: A shared ID for all components that belong to the same entity. Ie, the key that binds them together.
// ModelId: An asset ID for the model.
// Set: A Hash set.
//     void Set::Insert(key): Inserts a key into the set.
//     void Set::Remove(key): Removes a key from the set.
//     size_t Set::Count(): returns the number of keys in the set.
// Map<T_Key, T_Value>: A container of Key-Value-Pairs.
//     T_Value& Map::FindOrCreate(T_Key): finds the value mapped to this key. If it doesn't exist, it creates it.
//     T_Value* Map::Find(T_Key): Finds the value mapped to this key. Returns nullptr if it doesn't exist.
// Array<T>: A generic dynamic array class.
//     void Array::Reserve(size_t count): reserves count spaces in the array. Good for making sure growth happens only once. 
//         Will not shrink the array is count is already less than the Array's space.
//     void Array::Clear(): removes all elements in the array -- does not shrink the array's allocated space.
//     void Array::Push(const T&) pushes a copy of T onto the array.
// Matrix44: A 4x4 matrix.


class RenderingSystem
{
    typedef Set<ComponentId> ModelInstanceSet
    Map<ModelId, ModelInstanceSet> m_ModelInstances

    void OnModelCreated( Model& model )
    {
        //Create a new entry in the ModelInstanceSet if there is an associated transform.

        ComponentId compId = GetComponentId(model);
        ModelInstanceSet& instanceSet = m_ModelInstances.FindOrCreate(model.GetModelId());
        instanceSet.Insert(compId);
    }
    
    void OnModelDestryed( Model& model )
    {
        ComponentId compId = GetComponentId(model);
        ModelInstanceSet* pOldInstanceSet = m_ModelInstances.Find( oldId );
        if(pOldInstanceSet)
        {
            oldInstanceSet->Remove(compId);
        }
    }
    
    void OnModelIdChanged( Model& model, ModelId oldId )
    {
        ComponentId compId = GetComponentId(model);
        ModelInstanceSet* pOldInstanceSet = m_ModelInstances.Find( oldId );
        if(pOldInstanceSet)
        {
            oldInstanceSet->Remove(compId);
        }
        ModelInstanceSet& newInstanceSet = m_ModelInstances.FindOrCreate( model.GetModelId() );
        newInstanceSet.Insert(compId);
    }
    
public:
    //All systems get a few virtual functions like this. Update, FixedUpdate, etc.
    void Draw() override final 
    {
        Array<Matrix44> transforms;
        for(ModelInstanceSet& instanceSet, m_ModelInstances)
        {
            transforms.Reserve(instanceSet.Count());

            for(ComponentId compId, instanceSet)
            {
                Transform* pTransform = GetComponent<Transform>(compId);
                if(pTransform)
                {
                    transforms.Push(pTransform->GetWorldMatrix());
                }
            }
            // And then here to do the rendering part where you bind the model buffer, the instance buffer (the transforms)
            // and make the drawcall using your favorite Graphics API's instance drawing method, such as D3D's DrawInstanced() 
            // method or OpenGL's glDrawArraysInstanced() function.
        }
    }
}

Of course the real advantage to this set up isn’t that a single system can do this. It’s that this system can operate and never had to know about any other system. Dozens of other systems could manipulate the transform data and the RenderingSystem would never know and would never need to know. That’s the beauty. You can add a PlayerControllerSystem, PhysicsSystem, AIControllerSystem, or whatever else to push and pull the objects around and the RenderingSystem doesn’t care.

Moreover, the RenderingSystem can make optimizations that won’t interfere with the other systems. For instance, rebuilding the instance buffer every frame is a bit excessive, and we could change ModelInstanceSet from a typedef to a struct containing the Set and a dirty flag, and the instance buffer. If it’s not dirty, we don’t rebuild it. The dirty flag would need to check for some additional things, like when transforms are created or destroyed, if there is a Model with a matching ComponentId, but that’s all done here, inside this one file.

The last few things I’m going to bring up about how much I like this take on Object Oriented Programming, are the following:

If for any reason this particular rendering system needed to be gutted and replaced with something else. Maybe you’re changing graphics APIs, or the guy who originally put it together was an absolute goof and wrote it horribly, you can safely extract and replace it with whatever you need.
If for any reason you don’t want the rendering system at all (ie, on a server, or a command-line client) then you just don’t instantiate it. The client can still instantiate one, and then all the client and server have to do is keep their component data in sync.

So back to the data you don’t care about – that’s exactly what the ModelInstanceSet is all about. You care about it for bookkeeping that can make it the game perform faster or smarter, but it’s not the actual data (the actual data you care about are the components). It provides modularity that it can be dropped in or taken out easily.

This all gets to the point from the first blog post and the video that spurred me to write it. Object Oriented Programming, as it is currently utilized in all too much of the professional world, really is bad. But I don’t think that it means all OOP is bad, and I hope these two posts provide sufficient example of how OOP can be used well.

Sunday, August 28, 2016

The Data You Care About

I recently watched a video about why Object Oriented Programming is Bad and later tweeted a bit about it and quickly realize twitter wasn't the right platform to adequately describing my thoughts about a particular part of the video.

So here's my thoughts better laid out. Keep in mind, this is only in reference to what he says at 33:30 into the video -- not the entire video.

Basically the point he's making (and that I want to expand on) is that methods and the idea of encapsulation that they support are not always bad. He says that when the method is tightly related to the data of the class, it's appropriate. A common example are ADTs -- Arrays, Lists, HashLists and other generic containers and constructs.

The question I want to elaborate on is "Why do methods work out okay for ADTs but not other classes?" and "How can we use that to inform how we write other stuff?"

Here's the answer up front: Bookkeeping.

If you think about data (literally, the variables you declare) always keep in mind which ones are Data You Care About and which ones are data that do bookkeeping for the Data You Care About. Let's open up an array to see what I mean.

template<typename T>
class Array
{
   T* m_pArray;     // The actual array of data.
   int m_Count;     // How many elements are in the array.
   int m_Space;     // How much space has been allocated for the array.
public:
    // Constructors, accessors, etc.
}

If it's not obviously, only one of the members of this sample Array class are the actual data we care about -- the other two are just bookkeeping. The key thing here is that within the class, the various methods are manipulating m_Space and m_Count in order to keep track of how much memory has been allocated and how much memory has been initialized. If these were publicly exposed, anybody could write to these methods and screw up the classes accounting of data. In reality, if you know the internal structure of the array class, you could do some casting and pointer arithmetic to manipulate these values anyway. But that's not a big deal because something like that obviously looks like a very unsafe thing to do and you have to go out of your way to do so. Whereas array.m_Count = 10; looks like a perfectly normal line of code and you'd have to evaluate the surrounding code before realizing it was bad.

Like I mentioned in my tweets about this, this is more common than just ADTs, and in fact when you start separating data in your head into "real data" and "bookkeeping" you'll design your classes with much more clarity. Take a typical 3D transform class.

class Transform
{
   Vector3 m_Position;                            // Local Position
   Quaternion m_Orientation;                      // Local Orientation
   Vector3 m_Scale;                               // Local Scale
   mutable Matrix4x4 m_WorldMatrix;
   mutable bool m_WorldMatrixDirty = false;
public:
    // Constructors, accessors, etc.
}

Let's assume that m_Position is initialized as a (0,0,0), m_Orientation is an unrotated quaternion, and m_Scale is (1,1,1). The world matrix is initialized as an identity matrix. Maybe Vector3 is padded to 4 floats instead of 3 or any number of better decisions than how this class is laid out. The point is, there is Data You Care About and bookkeeping (AKA overhead).

This example is a bit deceiving because in the end, we probably only care about m_WorldMatrix. The local Position Orientation and Scale (POS) is likely just used so that when this transform's parent changed, we still have our data relative to the parent and can easily reconstruct our world matrix, which is used for rendering, physics, and many other systems. Note that this class isn't currently describing who the parent is -- could be bound by pointer or ID or something. It doesn't matter for the example being shown.

The obvious bookkeeper is m_WorldMatrixDirty. It's especially notable because it's got the mutable keyword. That means I can modify it within a const method. It makes this possible:

const Matrix4x4& Transform::GetWorldMatrix() const
{
    if(m_WorldMatrixDirty)
    {
        //recalculate world matrix.
        m_WorldMatrixDirty = false;
    }
    return m_WorldMatrix;
}

Now, we are only calculating the world matrix when it actually needs to be calculated. But to the outside world, they have no idea we're doing this trick. However, just as importantly, we don't want people directly writing to our local POS, because when the local POS changes it invalidates our world matrix. So we write something like this:

void Transform::SetLocalPosition(const Vector3& newPosition)
{
    m_Position = newPosition;
    m_WorldMatrixDirty = true;
}

This is almost like Event-Driven Programming¹. We'd guarded the access to our data member because we want to make sure we do something when that value changes. You can even imagine delegates and events used to notify other parts of the code when data has changed and they want to react to those changes. Here's some easy examples:

In a game, when something is added or removed from your inventory:
- The UI wants to know so it can update your inventory window.
- The chat/info box wants to know so they can show a message (ie, "Removed Steel Sword").
- If it is a networked or online game, a system that replicates data may want to notify your client that the item was added or removed.
In a game, when your health changes:
- Enemy AI may want to prioritize their targets. If you have low enough HP, maybe they just want to finish you off.
- Friendly AI may want to prioritize their healing or protective abilities.
- The UI will want to show the health change.
- The game may want to make your character grunt from the hit if it was large enough.
- In a networked game, a replication system needs to notify nearby clients of the change.
In a level editor, when you make a change to the heightfield:
- The terrain mesh will need to be rebaked for rendering, collision, pathfinding, etc.
- Placed objects may want to move with the terrain as it is being deformed.
- A terrain texturing system may want to change the texture based on height, slope, or any other number of properties of the new mesh.
- Flora may want to regenerate -- maybe the grass only grows on flat terrain and not hills. It wants to know if you just made a steep hill.

Or the UI reacts to a change in stats.

I could go on and on with examples.

The big takeaway is that none of this automatic bookkeeping could be done without methods (or at least some form of indicating which functions were allowed access to data members). Methods certainly are useful and maybe even more commonly useful than the author of the video is letting on. I'm not saying he's wrong -- just that it's slightly more nuanced than the video describes. And maybe that's just a result of only having so much time in a video to explain things. Only he'd really be able to comment on that.

Next post I'll talk about who might want to be listening to these events and what type of data those objects are likely to have (hint, it's not Data You Care About, and that's okay).

¹ Full disclaimer -- event driven programming has it's faults too -- it certainly shouldn't be used everywhere.