Draft Flag Driven Development (DFDD)
Software development is a risky endeavor. We are dealing with complicated machinery that runs by orchestrating many unrelated moving parts. Even if we manage to create a solid, sturdy system that obeys the intended processing logic, when we move that system to a new environment, no one seems certain whether it will continue to work flawlessly.
Software systems are like furniture: if you move them around often, something’s bound to break.
That unpredictability poses a lot of challenges to software delivery teams. No matter how much testing and verification goes into a changed system, unpleasant surprises are always possible (and they frequently happen) when such a thoroughly verified system gets moved to production.
What’s interesting at this point is that no one seems to be questioning this process of building something in one context, with the intention that it should only run in a different context. In software, we see systems being built in the development context, only to be shipped to the testing context, then into the staging or user-acceptance context, and then finally into the real context: production.
That’s a lot of shipping, a lot of opportunity to waste time and money on dealing with unpredictable (and, I would argue, unnecessary) problems. There’s gotta be a better way!
Why not build in the same context where the system is going to run?
Wait, what? That’s crazy talk, right? You can’t be serious in suggesting (or even hinting at) the possibility of making changes to the system that’s running in production!
Well, I am serious in suggesting that. Look, doesn’t it make perfect sense that, as we are building something, we are also testing it, step by step, in the real world? Why wait to try it out at some later time, and then discover that something got broken? How’s that an agreeable, sensible approach? To me, making something over here with the intention to get it to work over there doesn’t make much sense.
Take painting, for example. Suppose we hire painters to do some interior painting in our house. They receive instructions from us specifying how many walls to paint, the dimensions of the surfaces to be painted, which paints to use, what colour, what glossiness. They then go away and work in their workshop on their development walls. Once the work is done there and the paint has dried, they painstakingly scrape off the dried paint and carry it over to apply it to their testing walls. They then verify that the paint looks as expected, and then carefully scrape the paint off again and apply it to their staging walls. They invite us to review the job, and to hopefully provide them with the user acceptance sign-off. Once we sign off on the job, they carefully scrape the paint off and ship it to our house, where they apply it to the real, ‘production’ walls!
Madness, right? And yet, that’s how all software development and delivery projects work nowadays. The question is, why? And also, can’t we find better, less expensive, less error-prone ways to do it?
Same as in the house painting job, where the crew applies the paint directly to the real walls, software development should be done in the real environment.
Oh, but that would be unprecedented!
Any time I bring this topic up with my colleagues and coworkers, they get up in arms. “We can’t have the same approach in software! You don’t understand, software is a different beast.” Or some such hand-waving denials of my proposed remedy to the ailments of software delivery.
OK, let’s look at some real-life precedents. For example, let’s take a look at how I am creating this article right now. I logged into Medium.com and clicked on the “Write” button. The system then presented me with a blank page onto which I’ve started adding some content. Notice how I wasn’t forced to go into some ‘sandbox’, into some isolated, cloistered, local environment to be able to make some changes. I am right here, in the Medium production environment, crafting a brand new article. My keystrokes are being recorded and persisted in some Medium persistence layer (I’m assuming somewhere in the cloud, but really, who cares?).
Now, just because my every keystroke is persisted on some Medium servers doesn’t automatically mean that the whole world has the access and the ability to watch me as I type. Why not? Because, at this point, I am creating a so-called Draft. It’s a draft version of the product that I will finally (and hopefully) release for public consumption very soon.
So, while this article is in the Draft mode, only people with access to my Medium account can view it (and at this point, as a single author, only I have access to it).
At a certain point, I may decide I’m done. When that happens, I will simply push the “Publish” button, and my product will magically transform into a resource that will be available to all my consumers.
The beauty of this arrangement is that the functionality, look, and feel of this product are the same in Draft mode as they are in Published mode. Which means that when I get to the point of publishing it, it is pretty much guaranteed that nothing unexpected will occur. The act of publishing will be a non-event.
Why couldn’t we adopt the same model for software development and delivery? Any new changes that place the system into a so-called “dirty” state are tagged with the ‘Draft’ label, and as such do not interfere with the live system. That way, we can thoroughly test and verify those ‘dirty’ changes in the real environment and, once satisfied, simply flip the “Publish” flag, and voilà! The new changes become immediately available without breaking anything in the system.
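To make the idea a bit more concrete, here is a minimal sketch in Python of what such a Draft flag might look like. Everything in it is my own illustration, not an existing library: the names `Change`, `DraftFlagRegistry`, `add_draft`, `publish`, and `is_visible` are hypothetical, and I am assuming that a small set of named testers is allowed to see Draft changes while everyone else sees only what has been published.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """A change living in production; hypothetical name, for illustration only."""
    name: str
    draft: bool = True  # every new change starts out in the 'dirty' Draft state


@dataclass
class DraftFlagRegistry:
    """Tracks which changes are still Draft and who may see them (assumed design)."""
    changes: dict = field(default_factory=dict)
    testers: set = field(default_factory=set)

    def add_draft(self, name):
        # Register a new change; it is deployed but hidden from regular users.
        self.changes[name] = Change(name)

    def publish(self, name):
        # Flip the flag: the change becomes visible to everyone.
        self.changes[name].draft = False

    def is_visible(self, name, user):
        # Unknown changes are never visible; Draft changes only to testers.
        change = self.changes.get(name)
        if change is None:
            return False
        return (not change.draft) or user in self.testers


registry = DraftFlagRegistry(testers={"author"})
registry.add_draft("new-checkout")
print(registry.is_visible("new-checkout", "author"))   # True
print(registry.is_visible("new-checkout", "visitor"))  # False
registry.publish("new-checkout")
print(registry.is_visible("new-checkout", "visitor"))  # True
```

The point of the sketch is that the Draft change and the Published change are the very same object in the very same environment; publishing only flips a flag, which is why it can be a non-event.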
And of course, if something still goes awry (reality bats last!), the version control system will come to the rescue and simply yank us back to the most recent healthy state, no fuss, no muss.
I call this approach Draft Flag Driven Development (DFDD).