Annulling The Billion Dollar Mistake

Sep 11, 2015

Tony Hoare’s notorious Billion Dollar Mistake has prompted many software engineers to propose all kinds of wildly differing solutions. Most of those feel like shoddy patchwork, so we still need a good, solid resolution of this serious flaw. But let’s first hear Hoare himself describe how the problem came to be:

“I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.” (Tony Hoare, Null References: The Billion Dollar Mistake, 2009)

So what is this null reference? The definitions vary. Some people claim that a null reference is an expression of falsiness. What is meant by falsiness is that while something is not clearly true, it also isn’t clearly false. It’s a maybe.

While the above stab at introducing ‘maybe’ into highly formalized systems may seem reasonable at first glance, more focused analysis reveals that it is quite unreasonable. A well-structured formalized system should not resemble Swiss cheese by containing all kinds of holes in its logic. Put slightly differently, a well-structured formalized system must be articulate.

When we state that something in the system we’ve devised is falsy, we’re actually saying that our system is inarticulate. We’re admitting that we haven’t really thought things through when designing and building our system. Such sloppy engineering leads to unreliable, flaky and faulty systems and services.

How To Build An Articulate System?

Let’s examine the circumstances under which someone might be tempted to declare a null in one’s system:

1. The value we’re expecting to find in the system hasn’t been provided
2. The value we’re expecting to find in the system is unknown (as of this moment)
3. The value we’re expecting to find in the system is unknowable (such as, for example, “is there a beginning of time?”)

Suppose we need to know whether the session dialog has been established between the user and the system. If the user has logged in, the session value is found in the system by virtue of the act of logging in. But if the user has not logged in, then the session value is not found in the system. At that point, the system is architected to deliver a null value in response to the request ‘give me the name of the logged in user’. Upon receiving the null value, the requestor is startled and taken aback, which often results in system crashes.

The above scenario illustrates the circumstance described in item 1 (the value we’re expecting to find in the system hasn’t been provided). In such a case, the next logical question would be ‘will this value be provided at a later time?’, to which the only available answer is ‘no idea’ (another null). Both the original request and the follow-up question leave our system stuck in an ambiguous state, which is a surefire sign of shoddy engineering.

What would be the solution to the above problem? If whoever is using the system is expecting the system to provide a certain value, and that value hasn’t been provided, the result always comes as an unpleasant surprise to the consumer. The onus is always on the creator of the system to guarantee the presence of an unambiguous value. In this case, in answer to the question of whether the session dialog has been established, the system must provide a clear yes or no answer. Giving a vague, ambiguous ‘maybe’ (in the form of a null value) is a cheap copout which serves only to perpetuate the infamous billion dollar mistake.

More specifically, upon receiving the request to be given the name of the logged in user, the system should exhibit sturdy logic by either supplying the name of the logged in user, or by replying ‘user not logged in’. But it should certainly not just throw a tantrum and go ‘I wasn’t able to find the session, therefore I’m not able to give you the name of the logged in user, therefore I crash and burn!’
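The two possible replies can be sketched as explicit values. This is a minimal illustration in Python, not a prescribed design: the names `LoggedIn`, `NotLoggedIn`, `current_user` and `greeting`, as well as the dictionary standing in for the session store, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class LoggedIn:
    """The session exists, so the user's name is known."""
    name: str


@dataclass(frozen=True)
class NotLoggedIn:
    """The session does not exist; this is a definite answer, not a missing one."""


# Every request receives exactly one of these two values, never None.
SessionAnswer = Union[LoggedIn, NotLoggedIn]


def current_user(sessions: Dict[str, str], session_id: str) -> SessionAnswer:
    """Answer 'who is logged in?' for a hypothetical session store (a plain dict)."""
    if session_id in sessions:
        return LoggedIn(sessions[session_id])
    return NotLoggedIn()


def greeting(answer: SessionAnswer) -> str:
    """The requestor handles both answers explicitly."""
    if isinstance(answer, LoggedIn):
        return f"Hello, {answer.name}"
    return "user not logged in"
```

Because `NotLoggedIn` carries no `name` field, the requestor cannot accidentally read a name that isn’t there; it has to decide what to do with each answer.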

What about the issue described in item 2 (the value we’re expecting to find in the system is unknown as of this moment)? If we, the creators of the system, know that there is a possibility that the value will be provided at a later time, then providing a null answer, masquerading as ‘maybe’, is definitely wrong. Rather than going for another cheap copout, we must architect our system to offer so-called futures, or promises, when answering such questions. By providing a future, or a promise, to the asking client, we are issuing a promissory note, saying that the definitive answer will come in eventually. It is then up to the inquisitive client to decide whether they want to stick around and wait for the promised arrival of an answer, or move on. But such an articulate answer should definitely not make them crash and burn, as they often do when hit by the inarticulate answer that simply says ‘null’.
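As a rough sketch of the promissory note idea, Python’s standard `concurrent.futures` module can stand in for the system’s futures. The functions `slow_answer`, `ask` and `wait_or_move_on`, and the delays used, are hypothetical and only simulate a value that arrives later.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor, TimeoutError


def slow_answer(delay_seconds: float) -> str:
    """Simulates a value that is not known yet but will be known later."""
    time.sleep(delay_seconds)
    return "the value you asked for"


def ask(pool: ThreadPoolExecutor, delay_seconds: float) -> Future:
    """Reply immediately with a future (the promissory note), never with None."""
    return pool.submit(slow_answer, delay_seconds)


def wait_or_move_on(promise: Future, patience_seconds: float) -> str:
    """The client decides how long to wait; running out of patience is not a crash."""
    try:
        return promise.result(timeout=patience_seconds)
    except TimeoutError:
        return "moved on"
```

A client would call `ask` inside a `with ThreadPoolExecutor() as pool:` block and pass the returned future to `wait_or_move_on`. Giving up early does not cancel the work, so the same future can be consulted again later.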

Finally, item 3: the value we’re expecting to find in the system is unknowable (such as, for example, “is there a beginning of time?”). Providing a null answer to a question we know is unanswerable is very sloppy. Such unanswerable questions, if they’re for whatever reason allowed to be asked of our system, should always be answered with a constant value. Instead of replying with null (basically shrugging our shoulders and saying ‘why ask me?’), we should simply admit the lack of knowledge by providing a constant value, something like ‘unknowable’.
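One way to picture such a constant, assuming the system accepts only a closed set of questions: the enum `Question`, the constant `Unknowable.UNKNOWABLE`, and the sample answers below are all made up for illustration.

```python
from enum import Enum
from typing import Dict, Union


class Unknowable(Enum):
    """A single named constant that admits the lack of knowledge."""
    UNKNOWABLE = "unknowable"


class Question(Enum):
    """Hypothetical closed set of questions the system agrees to be asked."""
    DAYS_IN_A_WEEK = "how many days are in a week?"
    BEGINNING_OF_TIME = "is there a beginning of time?"


Answer = Union[str, Unknowable]

# Every question maps to a value; unanswerable ones map to the constant.
ANSWERS: Dict[Question, Answer] = {
    Question.DAYS_IN_A_WEEK: "7",
    Question.BEGINNING_OF_TIME: Unknowable.UNKNOWABLE,
}


def answer(question: Question) -> Answer:
    """Total over Question: there is no branch that returns None."""
    return ANSWERS[question]
```

The client can compare the reply against `Unknowable.UNKNOWABLE` like any other value, so the unanswerable case becomes ordinary data.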

Conclusion

When designing a formal system that must behave in a logical fashion, it is patently incorrect to arrange the foundation of the system’s structure around objects. If the system we’ve designed relies first and foremost on its objects to provide answers to the questions posed by the consumers of the system’s services, then we’ll inevitably inject holes into it: there will be various situations (temporally caused or otherwise) in which those objects may or may not be present to supply the answers. When an object is not present in the system (for whatever reason), and yet the logic baked into the system relies entirely on that object’s presence to provide an answer, trouble ensues in the form of the dreaded null.

Instead of building on such a faulty foundation, it is advisable to shift our focus and rely entirely on values. Our system should implement sturdy logic that governs the mapping and transformation of values; this mapping must be highly formalized, and the mapping/transformation logic must ensure that the system never hits an ambiguous state. That way, the system is not relying on the presence of some object/role, but is instead simply dishing out values when asked questions. In such an arrangement, it is impossible to ever encounter a situation where the answer such a highly articulate system comes up with is the shamefully inarticulate, vague null.
