Wednesday, December 28, 2016

The Data You Don't Care About

If you haven’t read the previous post, The Data You Care About, you probably should before continuing. It kinda sets up everything.

The last post was all about the Data You Care About and making sure that your methods only bother with such data. This post is about the other data – Data You Don’t Care About. It raises the question – if you don’t care about it, why am I even writing about it?

Data You Don’t Care About is all about bookkeeping. It facilitates doing things to the data you do care about. As in the previous post, the m_Count and m_Space data members of the example array class are this type of data. I really want to push a point here – it’s data that helps you do things to the data you care about. As a result, this data is intrinsically tied to functions.

Enter, the Doer classes.

Okay, okay. So there’s a bad stigma around Doer classes. They have long names, they often wrap only a single function, and can actually make your program harder to reason about. Yes, I agree, if done poorly, Doer Classes are all those terrible things. In fact, the very video that started me on these blog posts explicitly says they’re ugly and bad. In the context he’s talking about they are bad – I’ve seen that kind of code first-hand. What I’m proposing is a way to go about it that really… isn’t bad. Because really what we’re doing is associating bookkeeping data with the functions that do the bookkeeping. Classes just provide us a way of controlling access to that bookkeeping.

In the last post, I gave some examples of game systems that might want to respond to changes in data. These systems are doers. They can very often start as just a single function, but when you need to introduce bookkeeping (often for the sake of optimization, but sometimes other reasons) keeping the bookkeeping data with your function is quite useful.

Let me just throw out an example to make it a bit easier to grasp. Let’s say you have a system for rendering 3D objects in your game. You could loop through your Model components and let each one be a draw call, but there’s a better way to do it. You could organize them by mesh, so that you render all of one type of mesh all at once – probably using hardware instancing. Doing this organization will cost us a bit on the CPU, but will save us far more on our draw call count and the GPU.

// The code below depends on some context, so here's a quick rundown of the objects and methods used in the example:
// Declared elsewhere:
// Model: A Component that has data related to a model (ie, mesh, material, etc).
//     ModelId Model::GetModelId(): returns the ModelId.
// Transform: A Component that has data related to position, orientation, and scale.
//     const Matrix44& Transform::GetWorldMatrix(): returns a Matrix44 that represents the transform's world matrix.
// ComponentId: A shared ID for all components that belong to the same entity. Ie, the key that binds them together.
// ModelId: An asset ID for the model.
// Set: A Hash set.
//     void Set::Insert(key): Inserts a key into the set.
//     void Set::Remove(key): Removes a key from the set.
//     size_t Set::Count(): returns the number of keys in the set.
// Map<T_Key, T_Value>: A container of Key-Value-Pairs.
//     T_Value& Map::FindOrCreate(T_Key): finds the value mapped to this key. If it doesn't exist, it creates it.
//     T_Value* Map::Find(T_Key): Finds the value mapped to this key. Returns nullptr if it doesn't exist.
// Array<T>: A generic dynamic array class.
//     void Array::Reserve(size_t count): reserves count spaces in the array. Good for making sure growth happens only once.
//         Will not shrink the array if count is already less than the Array's space.
//     void Array::Clear(): removes all elements in the array -- does not shrink the array's allocated space.
//     void Array::Push(const T&): pushes a copy of T onto the array.
// Matrix44: A 4x4 matrix.

class RenderingSystem : public System   // Assumes a System base class that declares the virtual Draw().
{
    typedef Set<ComponentId> ModelInstanceSet;
    Map<ModelId, ModelInstanceSet> m_ModelInstances;

    void OnModelCreated( Model& model )
    {
        // Create a new entry in the ModelInstanceSet if there is an associated transform.
        ComponentId compId = GetComponentId(model);
        if (GetComponent<Transform>(compId) != nullptr)
        {
            ModelInstanceSet& instanceSet = m_ModelInstances.FindOrCreate(model.GetModelId());
            instanceSet.Insert(compId);
        }
    }

    void OnModelDestroyed( Model& model )
    {
        ComponentId compId = GetComponentId(model);
        ModelInstanceSet* pOldInstanceSet = m_ModelInstances.Find( model.GetModelId() );
        if (pOldInstanceSet != nullptr)
            pOldInstanceSet->Remove(compId);
    }

    void OnModelIdChanged( Model& model, ModelId oldId )
    {
        // Move the instance from the old model's set to the new model's set.
        ComponentId compId = GetComponentId(model);
        ModelInstanceSet* pOldInstanceSet = m_ModelInstances.Find( oldId );
        if (pOldInstanceSet != nullptr)
            pOldInstanceSet->Remove(compId);
        ModelInstanceSet& newInstanceSet = m_ModelInstances.FindOrCreate( model.GetModelId() );
        newInstanceSet.Insert(compId);
    }

    // All systems get a few virtual functions like this. Update, FixedUpdate, etc.
    void Draw() override final
    {
        Array<Matrix44> transforms;
        for (ModelInstanceSet& instanceSet : m_ModelInstances)   // Assumes iterating the Map yields its values.
        {
            transforms.Clear();
            transforms.Reserve(instanceSet.Count());
            for (ComponentId compId : instanceSet)
                if (Transform* pTransform = GetComponent<Transform>(compId))
                    transforms.Push(pTransform->GetWorldMatrix());
            // And then here you do the rendering part where you bind the model buffer, the instance buffer (the transforms)
            // and make the draw call using your favorite Graphics API's instance drawing method, such as D3D's DrawInstanced()
            // method or OpenGL's glDrawArraysInstanced() function.
        }
    }
};

Of course the real advantage to this setup isn’t that a single system can do this. It’s that this system can operate without ever having to know about any other system. Dozens of other systems could manipulate the transform data and the RenderingSystem would never know and would never need to know. That’s the beauty. You can add a PlayerControllerSystem, PhysicsSystem, AIControllerSystem, or whatever else to push and pull the objects around and the RenderingSystem doesn’t care.

Moreover, the RenderingSystem can make optimizations that won’t interfere with the other systems. For instance, rebuilding the instance buffer every frame is a bit excessive, so we could change ModelInstanceSet from a typedef to a struct containing the Set, a dirty flag, and the instance buffer. If it’s not dirty, we don’t rebuild it. Setting the dirty flag would need to account for some additional things, like transforms being created or destroyed when there is a Model with a matching ComponentId, but that’s all done here, inside this one file.
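
Here is a minimal sketch of what that struct might look like. It is not the engine's actual code: std::unordered_set and std::vector stand in for the Set and Array classes, Matrix44 is a placeholder, and MarkDirty(), GetInstanceBuffer(), and rebuildCount are names I made up for illustration. The lookup callable is assumed to map a ComponentId to its world matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-ins for the engine types used in the post.
using ComponentId = unsigned int;
struct Matrix44 { float m[16]; };

// ModelInstanceSet promoted from a typedef to a struct: the set of instances
// plus the bookkeeping needed to skip rebuilding the instance buffer.
struct ModelInstanceSet
{
    std::unordered_set<ComponentId> instances;
    std::vector<Matrix44> instanceBuffer;  // CPU-side copy of the per-instance transforms.
    bool dirty = true;                     // Starts dirty so the first draw builds the buffer.
    std::size_t rebuildCount = 0;          // Only here so the example can observe rebuilds.

    // Only an actual change to the set marks the buffer as stale.
    void Insert(ComponentId id) { if (instances.insert(id).second) dirty = true; }
    void Remove(ComponentId id) { if (instances.erase(id) > 0) dirty = true; }

    // A system would also call this when an instance's transform changes.
    void MarkDirty() { dirty = true; }

    // Rebuilds the buffer only when something changed since the last call.
    template <typename Lookup>
    const std::vector<Matrix44>& GetInstanceBuffer(Lookup lookup)
    {
        if (dirty)
        {
            instanceBuffer.clear();
            instanceBuffer.reserve(instances.size());
            for (ComponentId id : instances)
                instanceBuffer.push_back(lookup(id));
            assert(instanceBuffer.size() == instances.size());
            dirty = false;
            ++rebuildCount;
        }
        return instanceBuffer;
    }
};
```

Draw() would then ask each set for its buffer instead of refilling a local array, and frames where nothing changed would skip the rebuild entirely.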

The last few things I’m going to bring up about how much I like this take on Object Oriented Programming are the following:

  1. If for any reason this particular rendering system needs to be gutted and replaced with something else (maybe you’re changing graphics APIs, or the guy who originally put it together was an absolute goof and wrote it horribly), you can safely extract it and replace it with whatever you need.
  2. If for any reason you don’t want the rendering system at all (ie, on a server, or a command-line client) then you just don’t instantiate it. The client can still instantiate one, and then all the client and server have to do is keep their component data in sync.

So back to the data you don’t care about – that’s exactly what the ModelInstanceSet is all about. You care about it for bookkeeping that can make the game perform faster or smarter, but it’s not the actual data (the actual data you care about are the components). It also provides modularity: the system that owns it can be dropped in or taken out easily.

This all gets to the point from the first blog post and the video that spurred me to write it. Object Oriented Programming, as it is currently utilized in all too much of the professional world, really is bad. But I don’t think that it means all OOP is bad, and I hope these two posts provide sufficient example of how OOP can be used well.

Sunday, August 28, 2016

The Data You Care About

I recently watched a video about why Object Oriented Programming is Bad, later tweeted a bit about it, and quickly realized Twitter wasn't the right platform to adequately describe my thoughts about a particular part of the video.

So here are my thoughts, better laid out. Keep in mind, this is only in reference to what he says at 33:30 into the video -- not the entire video.

Basically the point he's making (and that I want to expand on) is that methods and the idea of encapsulation that they support are not always bad. He says that when the method is tightly related to the data of the class, it's appropriate. Common examples are ADTs -- Arrays, Lists, HashLists and other generic containers and constructs.

The question I want to elaborate on is "Why do methods work out okay for ADTs but not other classes?" and "How can we use that to inform how we write other stuff?"

Here's the answer up front: Bookkeeping.

When you think about data (literally, the variables you declare), always keep in mind which ones are Data You Care About and which ones are data that do bookkeeping for the Data You Care About. Let's open up an array to see what I mean.

template<typename T>
class Array
{
    T* m_pArray;     // The actual array of data.
    int m_Count;     // How many elements are in the array.
    int m_Space;     // How much space has been allocated for the array.
    // Constructors, accessors, etc.
};

If it's not obvious, only one of the members of this sample Array class is the actual data we care about -- the other two are just bookkeeping. The key thing here is that within the class, the various methods are manipulating m_Space and m_Count in order to keep track of how much memory has been allocated and how much memory has been initialized. If these were publicly exposed, anybody could write to these members and screw up the class's accounting of data. In reality, if you know the internal structure of the array class, you could do some casting and pointer arithmetic to manipulate these values anyway. But that's not a big deal because something like that obviously looks like a very unsafe thing to do and you have to go out of your way to do so. Whereas array.m_Count = 10; looks like a perfectly normal line of code and you'd have to evaluate the surrounding code before realizing it was bad.
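
To make that concrete, here is a small sketch of how Push() might keep the bookkeeping consistent. It is one possible implementation under my own assumptions, not the real class: the growth policy (start at 4, then double), the Grow() helper, and the Count()/Space() accessors are all made up for the example, and it requires T to be default constructible.

```cpp
#include <cassert>
#include <utility>

template <typename T>
class Array
{
public:
    Array() = default;
    Array(const Array&) = delete;             // Copying is left out to keep the sketch short.
    Array& operator=(const Array&) = delete;
    ~Array() { delete[] m_pArray; }

    void Push(const T& value)
    {
        if (m_Count == m_Space)
        {
            T copy(value);  // Copy first: value may refer to an element of this array.
            Grow(m_Space == 0 ? 4 : m_Space * 2);
            m_pArray[m_Count++] = std::move(copy);
            return;
        }
        m_pArray[m_Count++] = value;
    }

    int Count() const { return m_Count; }
    int Space() const { return m_Space; }

    const T& operator[](int index) const
    {
        assert(index >= 0 && index < m_Count);
        return m_pArray[index];
    }

private:
    // Reallocates the storage and updates m_Space to match.
    void Grow(int newSpace)
    {
        T* pNew = new T[newSpace];
        for (int i = 0; i < m_Count; ++i)
            pNew[i] = std::move(m_pArray[i]);
        delete[] m_pArray;
        m_pArray = pNew;
        m_Space = newSpace;
    }

    T* m_pArray = nullptr;  // The Data You Care About.
    int m_Count = 0;        // Bookkeeping: how many elements are in use.
    int m_Space = 0;        // Bookkeeping: how many elements are allocated.
};
```

Because m_Count and m_Space are private, the only code that can change them is the code that also keeps them in agreement with the allocation.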

Like I mentioned in my tweets about this, this is more common than just ADTs, and in fact when you start separating data in your head into "real data" and "bookkeeping" you'll design your classes with much more clarity. Take a typical 3D transform class.

class Transform
{
    Vector3 m_Position;                            // Local Position
    Quaternion m_Orientation;                      // Local Orientation
    Vector3 m_Scale;                               // Local Scale
    mutable Matrix4x4 m_WorldMatrix;
    mutable bool m_WorldMatrixDirty = false;
    // Constructors, accessors, etc.
};

Let's assume that m_Position is initialized as (0,0,0), m_Orientation is an unrotated quaternion, and m_Scale is (1,1,1). The world matrix is initialized as an identity matrix. Maybe Vector3 should be padded to 4 floats instead of 3, or any number of other decisions would be better than how this class is laid out. The point is, there is Data You Care About and bookkeeping (AKA overhead).

This example is a bit deceptive because in the end, we probably only care about m_WorldMatrix. The local Position, Orientation, and Scale (POS) is likely just used so that when this transform's parent changes, we still have our data relative to the parent and can easily reconstruct our world matrix, which is used for rendering, physics, and many other systems. Note that this class isn't currently describing who the parent is -- it could be bound by pointer or ID or something. It doesn't matter for the example being shown.

The obvious bookkeeper is m_WorldMatrixDirty. It's especially notable because it's got the mutable keyword. That means I can modify it within a const method. It makes this possible:

const Matrix4x4& Transform::GetWorldMatrix() const
{
    if (m_WorldMatrixDirty)
    {
        // Recalculate world matrix.
        m_WorldMatrixDirty = false;
    }
    return m_WorldMatrix;
}

Now, we are only calculating the world matrix when it actually needs to be calculated, and the outside world has no idea we're doing this trick. However, just as importantly, we don't want people directly writing to our local POS, because when the local POS changes it invalidates our world matrix. So we write something like this:

void Transform::SetLocalPosition(const Vector3& newPosition)
{
    m_Position = newPosition;
    m_WorldMatrixDirty = true;
}
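
Putting the two methods together, a compilable sketch might look like the following. It is heavily simplified and rests on my own assumptions: Vector3 and Matrix4x4 are placeholder structs, the matrix is row-major with translation in the last row, orientation and parenting are ignored, and m_RecalcCount exists only so the lazy behavior can be observed.

```cpp
#include <cassert>

// Placeholder math types for the sake of the example.
struct Vector3 { float x, y, z; };
struct Matrix4x4 { float m[4][4]; };

class Transform
{
public:
    void SetLocalPosition(const Vector3& newPosition)
    {
        m_Position = newPosition;
        m_WorldMatrixDirty = true;   // Invalidate the cached world matrix.
    }

    const Matrix4x4& GetWorldMatrix() const
    {
        if (m_WorldMatrixDirty)
        {
            // Simplified: scale and translation only, no orientation or parent.
            m_WorldMatrix = Matrix4x4{{
                { m_Scale.x, 0.0f, 0.0f, 0.0f },
                { 0.0f, m_Scale.y, 0.0f, 0.0f },
                { 0.0f, 0.0f, m_Scale.z, 0.0f },
                { m_Position.x, m_Position.y, m_Position.z, 1.0f } }};
            m_WorldMatrixDirty = false;
            ++m_RecalcCount;
        }
        return m_WorldMatrix;
    }

    int GetRecalcCount() const { return m_RecalcCount; }

private:
    Vector3 m_Position{ 0.0f, 0.0f, 0.0f };
    Vector3 m_Scale{ 1.0f, 1.0f, 1.0f };
    mutable Matrix4x4 m_WorldMatrix{{
        { 1.0f, 0.0f, 0.0f, 0.0f },
        { 0.0f, 1.0f, 0.0f, 0.0f },
        { 0.0f, 0.0f, 1.0f, 0.0f },
        { 0.0f, 0.0f, 0.0f, 1.0f } }};
    mutable bool m_WorldMatrixDirty = false;
    mutable int m_RecalcCount = 0;   // Bookkeeping for the example only.
};
```

Setting the position twice and then reading the matrix twice costs a single recalculation, and the read works through a const reference because the cache members are mutable.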

This is almost like Event-Driven Programming [1]. We've guarded the access to our data member because we want to make sure we do something when that value changes. You can even imagine delegates and events used to notify other parts of the code when data has changed and they want to react to those changes. Here are some easy examples:

  • In a game, when something is added or removed from your inventory:
    • The UI wants to know so it can update your inventory window.
    • The chat/info box wants to know so they can show a message (ie, "Removed Steel Sword").
    • If it is a networked or online game, a system that replicates data may want to notify your client that the item was added or removed.
  • In a game, when your health changes:
    • Enemy AI may want to prioritize their targets. If you have low enough HP, maybe they just want to finish you off.
    • Friendly AI may want to prioritize their healing or protective abilities.
    • The UI will want to show the health change.
    • The game may want to make your character grunt from the hit if it was large enough.
    • In a networked game, a replication system needs to notify nearby clients of the change.
  • In a level editor, when you make a change to the heightfield:
    • The terrain mesh will need to be rebaked for rendering, collision, pathfinding, etc.
    • Placed objects may want to move with the terrain as it is being deformed.
    • A terrain texturing system may want to change the texture based on height, slope, or any other number of properties of the new mesh.
    • Flora may want to regenerate -- maybe the grass only grows on flat terrain and not hills. It wants to know if you just made a steep hill.

Or the UI reacts to a change in stats.

I could go on and on with examples.
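
As a rough illustration of the health example, here is a sketch of a guarded value that notifies listeners when it changes. The Health class, Subscribe(), and the handler signature are all hypothetical names of mine, and std::function is just one convenient way to model a delegate.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class Health
{
public:
    using ChangedHandler = std::function<void(int oldValue, int newValue)>;

    explicit Health(int startingValue) : m_Value(startingValue) {}

    // Other systems (UI, AI, replication, ...) register interest here.
    void Subscribe(ChangedHandler handler)
    {
        assert(handler != nullptr);
        m_Handlers.push_back(std::move(handler));
    }

    int Get() const { return m_Value; }

    // The only way to write the value, so every change is announced.
    void Set(int newValue)
    {
        if (newValue == m_Value)
            return;                      // Nothing changed, nothing to announce.
        const int oldValue = m_Value;
        m_Value = newValue;
        for (const ChangedHandler& handler : m_Handlers)
            handler(oldValue, newValue);
    }

private:
    int m_Value;                              // The Data You Care About.
    std::vector<ChangedHandler> m_Handlers;   // Bookkeeping: who wants to hear about changes.
};
```

A UI system could subscribe to refresh the health bar while an audio system subscribes to play a grunt on big hits, and neither needs to know the other exists.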

The big takeaway is that none of this automatic bookkeeping could be done without methods (or at least some form of indicating which functions were allowed access to data members). Methods certainly are useful and maybe even more commonly useful than the author of the video is letting on. I'm not saying he's wrong -- just that it's slightly more nuanced than the video describes. And maybe that's just a result of only having so much time in a video to explain things. Only he'd really be able to comment on that.

Next post I'll talk about who might want to be listening to these events and what type of data those objects are likely to have (hint, it's not Data You Care About, and that's okay).

[1] Full disclaimer -- event-driven programming has its faults too -- it certainly shouldn't be used everywhere.

Friday, January 29, 2016

"Me Too" MMOs

I was reading through an excerpt of a speech from Jeff Strain. I've read the speech many times before throughout my career -- it was given in 2007 -- but reading it nearly a decade later I found myself unsure of the words being said.

Here's the excerpt:

"Before you start building the ultimate MMO, you should accept that “MMO” is a technology, not a game design. It still feels like many MMOs are trying to build on the fundamental designs established by UO and EQ in the late ’90s. In the heyday of Doom and Quake we all eventually realized that “3D” was a technology, distinct from the “FPS,” which was a game design. It’s time we accepted that for MMOs as well. We are finding ways to overcome many of the limitations of the technology that dictated the early MMO design, such as Internet latency and limited global scalability. These improvements can enable a new class of online games that break out of the traditional MMO mold and explore new territory. It can be a daunting proposition to willfully walk away from what seems to be a “sure thing” in game design, but lack of differentiation is probably the number one reason that MMOs fail, so we all need to leave the comfort zone and start innovating, or risk creating yet another “me too” MMO."

The speech is somewhat prophetic at the end, saying "me too" MMOs are likely to fail.

What's interesting is that 9 years later we've seen a lot of new types of MMOs: survival MMOs, the MMORTS, socially oriented online games, even strictly non-combat MMOs like Ever, Jane. The list goes on. But at the same time, we've still seen a lot of "Me Too" MMOs. Many have failed, but a fair number of them have succeeded. Perhaps strangest is the fact that Guild Wars 2, the successor to Guild Wars and made by the company co-founded by Jeff Strain, is very much a "Me Too" MMO that is succeeding. When Guild Wars 2 was first presented to the Guild Wars community, it was even pitched as "an MMO more like what you think a typical MMO is like" (paraphrasing here). The most notable difference is the persistent explorable zones. And while the dynamic events are an attempt at innovation, they're more of an evolution (Quests -> Group Quest -> Warhammer Online's Public Quests -> Guild Wars 2's Dynamic Events). Hearts are especially quest-like in their lack of real impact on the world your character lives in.

Now, by my second-hand hearing, Guild Wars 2 has been much more successful than Guild Wars 1 was from a monetary standpoint. I don't know if that's because of the more traditional design (granted, with plenty of non-traditional mechanics thrown in the mix) or a result of something else. Either way, I'm reading these words differently today than I was the last time I set eyes on them.

Saturday, January 16, 2016

Blending Pixels

A task at work has recently required me to do some graphics stuff. I'll be the first to admit that as a learned discipline I'm not a graphics programmer. I enjoy almost any kind of programming and in my hobby work I find graphics programming especially fun. But simple things in that realm of the programming world I am embarrassingly ignorant of. Even something like... blending two pixels.

But! That wasn't going to stop me from researching and validating the right way to alpha blend two pixels together.

Now in reality, the task isn't so much for pixels, but rather for heightfield data. Now often heightfield data is just a single-channel pixel anyway, but in our engine heights are represented as floats. To expand the feature set of our terrain engine, our heightfield data needed to be modified to include layers and an alpha channel. So each heightfield sample is now two channels -- height and alpha.

Of course, googling and grabbing an algorithm isn't all there was to it. I wasn't really going to be satisfied with the answers I found unless I could test it myself. Sure, I could plug the code in and see what results I get, but it was much faster to throw up my favorite web app -- Desmos Graphing Calculator.

Compared to my last documented use of Desmos, this is absurdly straightforward. I'm visualizing the alpha channel along the x axis and the height channel along the y axis. I'll admit, it's a bit disorienting until you get used to it. Regardless, a/h3 is the resulting value. a/h1 and a/h2 are the inputs. Alpha values are normalized, so we're working with ranges between 0 and 1.

The formula for the output ends up looking like this in code. As usual this code is for illustrating the formula -- not for being performant code.

struct HeightSample
{
    float height, alpha;
};

void Blend(HeightSample& out, const HeightSample& a, const HeightSample& b)
{
    out.height = a.alpha * a.height + b.alpha * (1.0f - a.alpha) * b.height;
    out.alpha = a.alpha + ((1.0f - a.alpha) * b.alpha);
}

It's interesting to note that while the alpha blending is a commutative operation (that is, if you swap the input values, the output is the same), the same is not true of the height channel. This is because sample a (the const HeightSample& a parameter) is considered the "top" pixel. The heights are not even commutative if both alpha values are 0.5. This isn't particularly intuitive, so I thought it was worth mentioning.
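
A quick way to convince yourself is to run both orders through the formula. The sketch below reuses the same math in a by-value helper; BlendOver() and NearlyEqual() are names I made up for the example, and the sample values are arbitrary.

```cpp
#include <cassert>
#include <cmath>

struct HeightSample
{
    float height, alpha;
};

// Same formula as Blend(), returning by value; a is the "top" sample.
HeightSample BlendOver(const HeightSample& a, const HeightSample& b)
{
    HeightSample out;
    out.height = a.alpha * a.height + b.alpha * (1.0f - a.alpha) * b.height;
    out.alpha = a.alpha + ((1.0f - a.alpha) * b.alpha);
    return out;
}

// Tolerant float comparison for checking results.
bool NearlyEqual(float x, float y)
{
    return std::fabs(x - y) < 1e-5f;
}
```

With a = {10, 0.5} and b = {20, 0.5}, blending a over b gives a height of 10, while b over a gives 12.5. Both orders give an alpha of 0.75.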