Wednesday, August 27, 2014

Programming as a communication to future programmers (including yourself)

I was recently in a discussion with one of my colleagues about some of the finer points of developing in C++. To be frank, we talk about this nearly every day, but this conversation in particular stuck out to me.

The particular thing we were talking about was how the code you write can clue future developers in to how something ought to be used or how it works. In C++, this is pretty important because the language lets developers accomplish the same task in a variety of ways. The example we were discussing was when to pass by reference and when to pass by pointer.

Surely, there have been thousands of discussions on whether to pass by reference or by pointer. Just google it and you'll see the evidence. We, however, took the approach of "When should I accept by reference or by pointer?" Moreover, what does that choice say about how I want to use the references or pointers you are giving me?

I phrase the question this way because the user of a function doesn't get to choose the function's signature -- the function's author chooses that. By writing your function in a particular way, you are communicating how it will be used by user code. References and pointers are a good example: each comes with language rules that permit certain things and restrict others. Therefore, depending on how you accept an object, you're communicating to the user how you expect to be able to use it.

If you are accepting by pointer, you are saying that
  1. It is okay for the object to not exist, i.e., it's optional. Pointers can be null, so only accept a pointer if you have a code path that handles the object not existing. In fact, it's probably best to put all pointer arguments at the end of the function signature with default " = nullptr" values on each one, making the function more convenient to use.
  2. The object, if not null, will live at least as long as the function's scope. Once you leave the function's scope, the object could be deleted at any time. This means that the function shouldn't store the pointer somewhere else for later use. Of course, this much isn't even guaranteed because...
  3. The function can delete the pointed-to object. This is often my biggest fear when using other people's code and passing objects to them by pointer. Does the function expect to take over ownership? Will the function delete my object? Of course, some functions are explicitly intended to do that and even say so in their name.
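To make the pointer rules above concrete, here is a minimal sketch of a function that accepts an optional object by pointer. The names `Logger` and `parsePositive` are made up for illustration. The pointer parameter sits last, defaults to nullptr, has a real "no object" code path, and is neither stored nor deleted by the function.

```cpp
#include <cassert>
#include <string>

// Hypothetical type, for illustration only.
struct Logger {
    std::string last;
    void write(const std::string& message) { last = message; }
};

// The logger is optional: it comes last, defaults to nullptr, and the
// function works with or without it. The pointer is only used during
// the call; it is never stored and never deleted here.
int parsePositive(const std::string& text, Logger* logger = nullptr) {
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') {
            if (logger) {
                logger->write("not a number: " + text);
            }
            return -1;  // signals failure in this sketch
        }
        value = value * 10 + (c - '0');
    }
    return value;
}

void pointerUsageExample() {
    // The caller who doesn't care about logging just leaves it off.
    assert(parsePositive("42") == 42);

    // The caller who does care passes the address of an object they own.
    Logger log;
    assert(parsePositive("4x", &log) == -1);
    assert(log.last == "not a number: 4x");
}
```

The signature alone tells the caller that logging is optional, which is the whole point of asking for a pointer here.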

As opposed to accepting a reference where...
  1. The object is guaranteed to exist.1 This is great because then there's no null-checking, the user knows that the object must exist, and if they dereference a pointer in order to pass it by reference, the null dereference happens on their turf, not mine -- it's their mistake, not a bug in my API. However, like a pointer, there is no guarantee that the object will exist longer than the scope of the function, so don't store the reference, or even the object's address, anywhere.
  2. The object will live for the entire scope of the function and you are guaranteed to have access to it for the entire duration of the function. With a pointer, I could reassign it to point to a different object; with a reference, I can't. Moreover, this means that...
  3. A reference cannot be deleted (with exceptions). Try calling delete on a reference. It doesn't work. Of course, you could call "delete &reference;", but let's be honest, if you're doing that, you have no business being a developer at all.
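Here is a small sketch of the reference side, using a made-up `Counter` type and `increment` function. It shows two of the points above: the function needs no null check, and a reference can't be reseated (assigning through it copies into the original object instead).

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct Counter {
    int value = 0;
};

// Accepting by reference says "this object must exist",
// so there is nothing to null-check in here.
void increment(Counter& counter) {
    ++counter.value;
}

void referenceUsageExample() {
    Counter a;
    increment(a);

    // A caller holding a pointer has to decide what null means
    // before dereferencing. That decision lives in their code.
    Counter* maybe = &a;
    if (maybe) {
        increment(*maybe);
    }
    assert(a.value == 2);

    // A reference can't be reseated: this assignment copies b into a,
    // and ref still refers to a afterwards.
    Counter b;
    b.value = 10;
    Counter& ref = a;
    ref = b;
    assert(&ref == &a);
    assert(a.value == 10);
}
```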

And there's more to communicating to your users how you will use the objects they pass to you. Using the const keyword is a great way to promise your user code that you won't modify the object passed to you (unless you cast it to a non-const, at which point somebody needs to cut you... deep... and in a main artery).
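As a rough illustration of that promise, compare two made-up functions, `sum` and `doubleAll`. One takes a const reference and one takes a plain reference, and the caller can tell from the signatures alone which one might change their data.

```cpp
#include <cassert>
#include <vector>

// const reference: a promise to only read the vector.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    // values.push_back(0);  // would not compile: values is const here
    return total;
}

// Non-const reference: the signature warns the caller that
// their vector is going to be modified.
void doubleAll(std::vector<int>& values) {
    for (int& v : values) {
        v *= 2;
    }
}

void constUsageExample() {
    std::vector<int> numbers{1, 2, 3};
    assert(sum(numbers) == 6);   // numbers is untouched
    doubleAll(numbers);          // numbers is now {2, 4, 6}
    assert(sum(numbers) == 12);
}
```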

Part of me feels dumb writing this. Above, I said that this was a finer point of developing in C++, but honestly it's a pretty blunt topic. This isn't an obscure practice. C++ developers have known these points since the dawn of the language. Heck, compilers even auto-generate copy constructors that take a const reference. Why? Because they never want to copy null and they want to guarantee that the copy constructor won't modify the original object.

Yet time after time after time, I see code that doesn't follow these simple rules, and it's often written by developers who've been at it for well over a decade. Instead of guaranteeing the existence of an object by asking for it by reference, they'll ask for it by pointer and put an assert at the very top of the function. That will prevent shit from hitting the fan in debug, but when the shit hits the fan in release, you're shit outa luck and will likely have a harder time finding the bug because release dumps never give you enough information (optimizations that make the callstack not match your code, only getting stack memory, etc).
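For comparison, here is a sketch of the two styles side by side, using a made-up `Mesh` type. The standard assert macro is compiled out when NDEBUG is defined, which is how release builds are typically configured, so the first version has no protection at all there.

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct Mesh {
    int triangles = 0;
};

// The pattern described above: ask for a pointer, then assert on it.
// With NDEBUG defined the assert disappears, and a null mesh goes
// straight through to the dereference below.
int triangleCountChecked(const Mesh* mesh) {
    assert(mesh != nullptr);
    return mesh->triangles;
}

// The alternative: ask for a reference, so the requirement is part of
// the signature in every build configuration.
int triangleCount(const Mesh& mesh) {
    return mesh.triangles;
}
```

Both functions return the same thing for a valid mesh. The difference is only in what the signature tells the caller and in where a null pointer gets noticed.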

In my mind, it all comes down to subtle communication and usability. I'm a tools programmer, so I think a lot about usability. What many programmers don't think about is that the first tool is the code itself. The usability of the code also needs to be considered in detail and that includes what it communicates to users of the code.

A comparison might explain it best. If you have a property of an object in a tool you're writing and that property can be one of a list of values that do not relate to each other in scale, the UI representation you'd likely create for that is a combobox. Why? Because there are, say, four discrete options and you can only pick one. But what if you create a slider? Well, a slider can snap to 0, 33, 66, and 100% to represent the four values, and only one can be selected. But a slider communicates a relationship between the possible values. A slider is, therefore, confusing.

Good uses of sliders: Graphic quality (scaling from low to high), view distance (scaling from near to far), field of view (scaling from narrow to wide).

Good uses of comboboxes: Physics Interpolation (none, interpolate, extrapolate), a character's profession (warrior, mage, ranger, priest), light type (point, spot, directional, area).

If you were to use a slider for the fields that lend themselves to comboboxes, the user could get confused. Wait, is a priest the best profession because it's highest on the slider? Is a point light not as good as a directional light because it's lower on the slider? Of course not, but when you present a slider for something that fits a combobox, the user will be unsure of how to use the tool.

This is what it's like to see a function signature that asks for an object in the wrong way. Possibly worst of all, it slows down future development as confused programmers stumble their way through learning your API.

Write so as to communicate. When you write, think, "what am I telling future developers about my code?" and use what is most appropriate.

1 While a reference is not absolutely guaranteed to refer to an existing object, you can at least never assign null to it directly in code. Because we're talking about what the static types tell other programmers, for all intents and purposes, a reference says "I am expecting this object to exist".