Proper Ways To Do Code Reviews
I’ve been involved in a number of software development projects where one of the main activities includes so-called code reviews. A team…
I’ve been involved in a number of software development projects where one of the main activities includes so-called code reviews. A team lead, or a senior developer would conduct these code reviews. Code reviews are conducted with the intention of detecting defective code. This time-honoured practice is put in place as a quality assurance measure, as a precautionary activity designed to prevent defective code from being deployed in a live environment.
Now, at a first glance, this practice seems completely reasonable. But in reality, it turns out that it usually is a colossal waste of time. There are several reasons why code reviews are a waste of time.
Reason 1: Senior Developers’ Time Is Valuable
When you are fortunate to have an experienced developer on the team (a senior developer, or a team lead), it would be foolish to waste their talent on spending time doing code reviews. Experienced developers are a precious resource that should be utilized for developing quality solutions.
Reason 2: Open Criticism Can Be Detrimental For The Team Morale
Most developers feel the pressure to churn a lot of hand cranked code, as they feel they are being evaluated by the lines of code they write. When working under such pressure, it is quite common to end up delivering code that is of sub standard quality. Whoever then gets to review such hastily produced code will have no difficulties identifying weak spots. Pointing out weak spots in someone else’s code is an ugly contact sport, which always results in bruised egos.
It is counter-productive to expose team members to the harshness of code reviews. Any comments coming from senior team leads are almost always going to be taken personally. This creates bad vibes and is contributing to the team’s low morale.
Nothing positive can be gained from making your team members feel inadequate.
Reason 3: Pointing Out Mistakes Is Not A Guarantee Of Successful Learning
A person who had perfected their coding skills over the years of constant practice can quite easily spot problem areas in the code written by a novice, or even by an intermediate programmer. But merely exposing the problem spots in someone else’s code doesn’t automatically mean that it is a lesson learned, and that next time the same or similar mistakes won’t be repeated. It takes much more than pointing out someone’s mistakes for them to master the skill.
It is therefore doubly counter-productive to get senior developers to review code written by less experienced developers. First, we are wasting precious time of a very capable resource on our team. And secondly, we are making less experienced members of our team feel demoralizingly inadequate.
So How Should Less Experienced Developers Learn From Their Mistakes?
Obviously, we are not advocating turning a blind eye on the substandard code produced by less experienced developers. We cannot do that for two reasons: one, we cannot allow our production code to be compromised by substandard, half-baked code. Two, we cannot let junior and intermediary developers just run amok and never learn from their mistakes. Less experienced team members must be made aware of their mistakes, and must be goaded toward learning from those mistakes. It is only in that way that they will ever get to the point of being senior developers.
But if we’re advising development teams to abstain from doing code reviews, how are we envisioning less experienced developers gaining much needed experience?
Junior and intermediary developers should be allowed to try out different ways of writing their code. But at the same token, they should be made fully aware of the obvious defects and inadequacies of the produced code. However, since raising the quality issues by more experienced team members is inevitably leading toward hurt and bruised egos, how are we to implement this learning process?
We must introduce an impartial facility, a non-judgmental way to evaluating and correcting produced code. Typically, we resort to static code analyzers, such as CodeClimate, for example. If we set our project with facilities such as CodeClimate, we can integrate it with our automated build and pull request processes. When a budding developer makes some change to the codebase and commits the branch to the repo, the automated build and code analysis process reviews the proposed change. If the new code fails to meet the preordained standards that code analysis tool imposes, the changed code goes back to the developer. The code analysis tool with provide the report explaining why did the analyzed code fail the minimum acceptable standards, as well as the suggested best practices for how to fix the defects.
So how’s that different from the code review performed by the senior developer? Well, it is hard to take personally the review conducted by an impartial mechanical facility. That entire affair is a private matter between the developer and the mechanical facility, and the ‘rejection’ should not hurt the team member’s morale. Quite the reverse, catching some glaring and obvious defects before they reach the eyes of other team members is typically viewed as a huge advantage and a great service. The back-and-forth between the developers and the impartial code review facility is a perfect occasion for gaining new skills and learning from one’s mistakes in the privacy of one’s own coding session.
Is There Any Need For A Code Review?
Definitely, but with a proviso. Before we interrupt senior developers with the trifling matters of reviewing novice’s code, a few more processes must be fully satisfied.
First and foremost, any modification/addition that affects the code repo must be instigated from a clearly defined and fully vetted requirements document. We must keep in mind at all times that while functionality is definitely an asset, code is always a liability. As software engineers, we must practice aversion toward adding more code, or modifying the code that is already proven to work fine. In other words, there always must exist a very good reason for making changes to the code repository.
When a developer receives a formal request for change, the first thing that needs to happen is that they must write an executable test that encapsulates the expectations of that request. We call that executable expectation a unit test. Any code that gets committed to the proposed development branch that does not have its accompanying executable expectation test (i.e. a unit test), should never qualify for being a candidate to be added to the production codebase.
Assuming that the above criterion is fully met, the implementation code that satisfies the expectations in the unit test must be proved to not only function as expected, but to also not break any other functionality. For that to happen, an entire suite of unit tests must run. Only if all unit test successfully pass can the proposed development branch be considered a serious candidate for the review.
However, the above is a necessary, but not a sufficient condition for the code review to commence. Even after all the proposed code gets submitted with the corresponding unit tests, and even after all unit tests successfully pass, that is in no way a guarantee that the submitted code is of sufficient quality.
As we mentioned earlier, the modified code now must be processed by the static code analysis facility. This is where the team lead can exercise their judgment by calibrating the code analysis tool to accept/reject submitted code according to some threshold. For instance, if the team lead decided that only code that gets grade B or higher qualifies for a review, the code analysis tool will flatly reject any proposed code that is graded C+ or lower. Right there we have an impartial and very efficient bulwark against the unwanted low quality code.
Okay, so now assuming that all the unit tests successfully pass and that the code analysis tool grades the new code with B+ or higher, is the branch finally ready for the code review?
Well, no, not really. Just having acceptable code coverage and the acceptable code grade doesn’t automatically guarantee desired code quality. We need more scrubbing of the code to happen before we consider pushing it in front of the senior developers.
Why Unit Tests Aren’t Enough?
As is already well established, unit tests are a great way to constrain the design of implemented code. We observe that majority of developers, left to their own devices when writing code, end up with a bloated solution. This bloat is undesirable, and is usually the outcome of tight coupling. It is typically not that easy to produce highly decoupled code at will, which is why unit tests come in quite handy.
When implementing code by writing unit tests first, we are forced to produce lean, self-contained, self sufficient modules. Such modules are then amenable to be arranged in various configurations, thus providing powerful software solutions. So the true magic behind this power lies in radical decoupling and radical decentralization/isolation.
Still, same as we cannot guarantee high quality implemented code that was written without the guidance of unit tests, we cannot guarantee that the unit tests themselves are of any reasonable quality. In other words, same as it is easy to write substandard implementation code, it is equally easy to write substandard unit tests. And as is always the case, the garbage in, garbage out principle applies. If we start our development by attempting to write code that will make substandard unit test pass, well then the resulting implemented code cannot help but be of substandard quality.
Enter Mutation Testing
Pay attention how nowhere in the above scenario do we see any senior, experienced developers get involved. We’re still at the private development session level, where a budding developer is writing code in isolation, hoping that the new code will eventually get accepted and merged into the master branch of the code repository.
So now that we have all unit tests in green (that is, satisfactory pass), we’re still not convinced that the resulting implemented code is worthy of being presented to the senior developer for the review. Why is that?
As we said, we need to ensure that the unit tests themselves are worthy of the senior developer’s attention. How do we do that? Simple — we employ mutation testing.
What mutation testing is should be left for an entirely new post, but for now suffice it to say that without executing the mutation testing process, we cannot be sure that our code is indeed of any acceptable quality.
If the mutation testing fails, the developer gets the report explaining where the defects are and how to fix them. The ball is now in the developer’s court to fix all the outstanding issues before, finally, the developer can arrive at the point where they can create a pull request.
Pull Request — Where The Rubber Meets The Road
So finally we arrive at the code review conducted by a human! All the previous hurdles that we forced the developer to go through have been useful because a) they offered valuable learning lessons, and b) they spared senior developers’ precious time that is then used more wisely on solving more pressing issues.
But when the moment arrives for the senior developer to review the submitted pull request, if the above procedure was respected, there will really be very little to review. By having all the unit and mutating tests pass, we are ensured that the submitted code is as lean as it could be. And by having the automatic build successfully pass, we are ensured that no regression was introduced via the code change.
Finally, by passing the grade threshold, imposed by that static code analyzer, we are ensured that all the corporate coding standards have been fully met.
When we put all these qualities together, we reach the point where we realize that there really isn’t much for the senior staff to review. Aside form a few stylistic remarks that senior developers might suggest, the rest should be golden. If the delivered code is radically decoupled, decentralized and isolated, absence of undesirable bloat is ensured. Future extensibility and reusability is also ensured via the same qualities that the new code embodies.
Thus we kill three birds with one stone: 1. we relieve senior staff from duties to breathe down the novices’ necks, 2. we give novice developers the chance to master the ropes by learning from the automated tools, and 3. we enjoy worry- and defect-free code that delivers new functionality to our product.
Edit: Ted Cohn, who generously reviewed my article, corrected me by adding that QA must intervene after the mutating testing is done, and before the pull request gets opened. It is QA who is the final gate keeper deciding whether the proposed development branch is primed for the pull request or not. Thank you Ted for this awesome point!