Wednesday, December 28, 2016

The Data You Don't Care About

If you haven’t read the previous post, you probably should before continuing. It sets up pretty much everything here.

The last post was all about the Data You Care About and making sure that your methods only bother with that data. This post is about the other data – the Data You Don’t Care About. Which raises the question – if you don’t care about it, why am I even writing about it?

Data You Don’t Care About is all about bookkeeping. It facilitates doing things to the data you do care about. As in the previous post, the m_Count and m_Space data members of the example array class are this type of data. I really want to push a point here – it’s data that helps you do things to the data you care about. As a result, this data is intrinsically tied to functions.
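As a rough reminder of what that split looks like, here is a minimal sketch of such an array class. Only m_Count and m_Space are names from the previous post; IntArray, Push, and the doubling growth policy are assumptions made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical minimal array of ints, for illustration only.
class IntArray
{
    std::unique_ptr<int[]> m_Data;      // the data you care about: the elements
    std::size_t            m_Count = 0; // bookkeeping: elements currently in use
    std::size_t            m_Space = 0; // bookkeeping: elements that fit before growing

public:
    // Push is the function the bookkeeping exists for: it reads and updates
    // m_Count and m_Space so callers never have to.
    void Push(int value)
    {
        if (m_Count == m_Space)
        {
            const std::size_t newSpace = (m_Space == 0) ? 4 : m_Space * 2;
            std::unique_ptr<int[]> grown(new int[newSpace]);
            std::copy(m_Data.get(), m_Data.get() + m_Count, grown.get());
            m_Data  = std::move(grown);
            m_Space = newSpace;
        }
        m_Data[m_Count++] = value;
    }

    int operator[](std::size_t index) const
    {
        assert(index < m_Count);
        return m_Data[index];
    }

    std::size_t Count() const { return m_Count; }
    std::size_t Space() const { return m_Space; }
};
```

Nobody outside the class ever touches m_Count or m_Space directly; they only exist so that Push can do its job.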

Enter, the Doer classes.

Okay, okay. So there’s a stigma around Doer classes. They have long names, they often wrap only a single function, and they can actually make your program harder to reason about. Yes, I agree: done poorly, Doer classes are all those terrible things. In fact, the very video that started me on these blog posts explicitly says they’re ugly and bad. In the context he’s talking about, they are bad – I’ve seen that kind of code first-hand. What I’m proposing is a way to go about it that really… isn’t bad, because all we’re doing is associating bookkeeping data with the functions that do the bookkeeping. Classes just give us a way of controlling access to that bookkeeping.

In the last post, I gave some examples of game systems that might want to respond to changes in data. These systems are doers. They can very often start as just a single function, but when you need to introduce bookkeeping (often for the sake of optimization, but sometimes for other reasons), keeping the bookkeeping data with your function is quite useful.

Let me just throw out an example to make it a bit easier to grasp. Let’s say you have a system for rendering 3D objects in your game. You could loop through your Model components and let each one be a draw call, but there’s a better way to do it. You could organize them by mesh, so that you render every instance of a given mesh at once – probably using hardware instancing. Doing this organization will cost us a bit on the CPU, but will save us far more on our draw call count and on the GPU.

// The code below relies on a few types declared elsewhere, so here's a quick low-down on the objects and methods it uses:
// Declared elsewhere:
// Model: A Component that has data related to a model (ie, mesh, material, etc).
//     ModelId Model::GetModelId(): returns the ModelId.
// Transform: A Component that has data related to position, orientation, and scale.
//     const Matrix44& Transform::GetWorldMatrix(): returns a Matrix44 that represents the transform's world matrix.
// ComponentId: A shared ID for all components that belong to the same entity. Ie, the key that binds them together.
// ModelId: An asset ID for the model.
// Set: A Hash set.
//     void Set::Insert(key): Inserts a key into the set.
//     void Set::Remove(key): Removes a key from the set.
//     size_t Set::Count(): returns the number of keys in the set.
// Map<T_Key, T_Value>: A container of Key-Value-Pairs.
//     T_Value& Map::FindOrCreate(T_Key): finds the value mapped to this key. If it doesn't exist, it creates it.
//     T_Value* Map::Find(T_Key): Finds the value mapped to this key. Returns nullptr if it doesn't exist.
// Array<T>: A generic dynamic array class.
//     void Array::Reserve(size_t count): reserves count spaces in the array. Good for making sure growth happens only once.
//         Will not shrink the array if count is already less than the Array's space.
//     void Array::Clear(): removes all elements in the array -- does not shrink the array's allocated space.
//     void Array::Push(const T&): pushes a copy of T onto the array.
// Matrix44: A 4x4 matrix.


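class RenderingSystem : public System // System: assumed base class (declared elsewhere) that declares the virtual Draw()
{
    typedef Set<ComponentId> ModelInstanceSet;
    Map<ModelId, ModelInstanceSet> m_ModelInstances;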

    void OnModelCreated( Model& model )
    {
        //Register this model's ComponentId under its ModelId. Whether a Transform exists is checked later, in Draw().

        ComponentId compId = GetComponentId(model);
        ModelInstanceSet& instanceSet = m_ModelInstances.FindOrCreate(model.GetModelId());
        instanceSet.Insert(compId);
    }
    
    void OnModelDestroyed( Model& model )
    {
        //Remove this model's ComponentId from the set for its current ModelId.
        ComponentId compId = GetComponentId(model);
        ModelInstanceSet* pInstanceSet = m_ModelInstances.Find( model.GetModelId() );
        if(pInstanceSet)
        {
            pInstanceSet->Remove(compId);
        }
    }
    
    void OnModelIdChanged( Model& model, ModelId oldId )
    {
        ComponentId compId = GetComponentId(model);
        ModelInstanceSet* pOldInstanceSet = m_ModelInstances.Find( oldId );
        if(pOldInstanceSet)
        {
            pOldInstanceSet->Remove(compId);
        }
        ModelInstanceSet& newInstanceSet = m_ModelInstances.FindOrCreate( model.GetModelId() );
        newInstanceSet.Insert(compId);
    }
    
public:
    //All systems get a few virtual functions like this. Update, FixedUpdate, etc.
    void Draw() override final 
    {
        Array<Matrix44> transforms;
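        for(ModelInstanceSet& instanceSet : m_ModelInstances) // assumes iterating a Map yields its values
        {
            transforms.Clear(); // drop the previous model's matrices; keeps the allocated space
            transforms.Reserve(instanceSet.Count());

            for(ComponentId compId : instanceSet)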
            {
                Transform* pTransform = GetComponent<Transform>(compId);
                if(pTransform)
                {
                    transforms.Push(pTransform->GetWorldMatrix());
                }
            }
            // This is where the rendering happens: bind the model buffer and the instance buffer (the transforms),
            // then make the draw call using your favorite graphics API's instanced drawing method, such as D3D's
            // DrawInstanced() method or OpenGL's glDrawArraysInstanced() function.
        }
    }
};
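The class above leans on engine types that are declared elsewhere, so it won't compile on its own. Here is a self-contained sketch of the same grouping idea using only the standard library. InstancedRenderer, the integer id types, and passing the transforms in as a map are all stand-ins I made up for illustration; a real system would pull Transform components from the engine and issue an actual instanced draw call instead of counting them.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Stand-in types for illustration, not the engine's real ones.
using ComponentId = unsigned;
using ModelId     = unsigned;
struct Matrix44 { float m[16]; };

class InstancedRenderer
{
    using ModelInstanceSet = std::unordered_set<ComponentId>;
    std::unordered_map<ModelId, ModelInstanceSet> m_ModelInstances; // bookkeeping

public:
    void OnModelCreated(ComponentId compId, ModelId modelId)
    {
        m_ModelInstances[modelId].insert(compId);
    }

    void OnModelDestroyed(ComponentId compId, ModelId modelId)
    {
        auto it = m_ModelInstances.find(modelId);
        if (it != m_ModelInstances.end())
        {
            it->second.erase(compId);
        }
    }

    // Gathers one batch of matrices per model and returns how many instanced
    // draw calls would be issued. `transforms` stands in for Transform storage.
    std::size_t Draw(const std::unordered_map<ComponentId, Matrix44>& transforms) const
    {
        std::size_t drawCalls = 0;
        std::vector<Matrix44> batch;
        for (const auto& entry : m_ModelInstances)
        {
            batch.clear();
            batch.reserve(entry.second.size());
            for (ComponentId compId : entry.second)
            {
                auto found = transforms.find(compId);
                if (found != transforms.end())
                {
                    batch.push_back(found->second);
                }
            }
            if (!batch.empty())
            {
                ++drawCalls; // a real renderer would upload `batch` and draw here
            }
        }
        return drawCalls;
    }
};
```

With three entities sharing one model and two sharing another, Draw reports two draw calls rather than five, which is the whole reason for keeping the bookkeeping around.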

Of course, the real advantage of this setup isn’t that a single system can do this. It’s that this system can operate without ever having to know about any other system. Dozens of other systems could manipulate the transform data and the RenderingSystem would never know, and would never need to know. That’s the beauty. You can add a PlayerControllerSystem, PhysicsSystem, AIControllerSystem, or whatever else to push and pull the objects around, and the RenderingSystem doesn’t care.

Moreover, the RenderingSystem can make optimizations that won’t interfere with the other systems. For instance, rebuilding the instance buffer every frame is a bit excessive. We could change ModelInstanceSet from a typedef to a struct containing the Set, a dirty flag, and the instance buffer. If it’s not dirty, we don’t rebuild it. Keeping the dirty flag correct would mean watching for some additional things, like transforms being created or destroyed when there is a Model with a matching ComponentId, but that’s all done here, inside this one file.
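A minimal sketch of that struct might look like the following. Everything here is an assumption for illustration: ModelInstances, MarkDirty, GetInstanceBuffer, and the getMatrix callback (which stands in for looking up a Transform's world matrix) are names I made up, and rebuildCount exists only so the example can show when a rebuild happens.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Stand-in types for illustration, not the engine's real ones.
using ComponentId = unsigned;
struct Matrix44 { float m[16]; };

struct ModelInstances
{
    std::unordered_set<ComponentId> ids;            // which entities use this model
    std::vector<Matrix44>           instanceBuffer; // cached per-instance matrices
    bool                            dirty = true;   // true when the cache is stale
    std::size_t                     rebuildCount = 0; // example-only instrumentation

    void Insert(ComponentId id) { if (ids.insert(id).second) { dirty = true; } }
    void Remove(ComponentId id) { if (ids.erase(id) > 0)     { dirty = true; } }

    // The system would call this when a matching Transform is created or destroyed.
    void MarkDirty() { dirty = true; }

    // getMatrix(id) returns a pointer to the entity's world matrix, or nullptr
    // if the entity has no Transform.
    template <typename GetMatrix>
    const std::vector<Matrix44>& GetInstanceBuffer(GetMatrix getMatrix)
    {
        if (dirty)
        {
            instanceBuffer.clear();
            instanceBuffer.reserve(ids.size());
            for (ComponentId id : ids)
            {
                if (const Matrix44* pMatrix = getMatrix(id))
                {
                    instanceBuffer.push_back(*pMatrix);
                }
            }
            dirty = false;
            ++rebuildCount;
        }
        return instanceBuffer;
    }
};
```

Asking for the buffer twice in a row only rebuilds it once; the second call hands back the cached vector. None of the other systems need to know that this cache exists.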

The last things I’m going to bring up about why I like this take on Object Oriented Programming are the following:

  1. If for any reason this particular rendering system needs to be gutted and replaced with something else (maybe you’re changing graphics APIs, or the guy who originally put it together was an absolute goof and wrote it horribly), you can safely extract it and replace it with whatever you need.
  2. If for any reason you don’t want the rendering system at all (ie, on a server, or a command-line client) then you just don’t instantiate it. The client can still instantiate one, and then all the client and server have to do is keep their component data in sync.
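To make the second point concrete, here is a small sketch of a start-up routine that simply skips the rendering system on a server. The System base class, the Name() method, CreateSystems, and the isServer flag are all hypothetical; the systems here are empty shells that only exist to show the wiring.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical base class for all systems.
struct System
{
    virtual ~System() = default;
    virtual std::string Name() const = 0;
};

struct PhysicsSystem final : System
{
    std::string Name() const override { return "Physics"; }
};

struct RenderingSystem final : System
{
    std::string Name() const override { return "Rendering"; }
};

// Builds the list of systems for this process. A server gets everything
// except rendering; nothing else has to change because no other system
// depends on the RenderingSystem.
std::vector<std::unique_ptr<System>> CreateSystems(bool isServer)
{
    std::vector<std::unique_ptr<System>> systems;
    systems.push_back(std::make_unique<PhysicsSystem>());
    if (!isServer)
    {
        systems.push_back(std::make_unique<RenderingSystem>());
    }
    return systems;
}
```

The component data is the same on both sides; the only difference is which doers get created to act on it.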

So, back to the data you don’t care about – that’s exactly what the ModelInstanceSet is. You care about it for bookkeeping that can make the game perform faster or smarter, but it’s not the actual data (the actual data you care about are the components). Keeping it inside the system also provides modularity: the system can be dropped in or taken out easily.

This all gets back to the point from the first blog post and the video that spurred me to write it. Object Oriented Programming, as it is currently used in all too much of the professional world, really is bad. But I don’t think that means all OOP is bad, and I hope these two posts provide a sufficient example of how OOP can be used well.