A lot of the initial responses we get about Review Board go something like this:

Review Board looks cool, but it’s designed for pre-commit reviews. I don’t like the idea of needing review before committing, for a number of reasons. Can my use case be supported?

The short answer to this question is “Yes, it’s supported, but our UI isn’t optimized for that case.” In particular, it requires using the post-review command line tool to create the review request instead of having a nice web UI for selecting a revision to review, and it’s almost completely undocumented.

While we’re interested in generalizing Review Board into a tool which supports a variety of workflows, I’m not writing this today to explain how to make it work. I’m writing this to explain why all of you who think this are wrong.

Before I finish my attempts to offend everyone, some people should leave the room:

If you’re working in an industry such as telecom or aerospace, and you have a very intensive code review process to ensure perfection, this probably doesn’t apply. I appreciate your hard work ensuring that the airplane I’m flying in won’t fall out of the sky, and that I can call 9-1-1 when needed.

Next, if you’re not using version control, stand up, hang your head in shame, and go set up a server. SVN, Git, Mercurial, it doesn’t really matter.  Visual SourceSafe is maybe a bad choice, though. Seriously, version control is number-freaking-one on the famous Joel Test. Once you’re up and running with that (get a bug tracker too, while you’re at it) you can start thinking about code review.

Okay, now we’re down to just those enlightened, version-control-using hackers (ahem, software engineers) working on the sort of normal software that we do. I promised to offend the rest of you. Here we go.

It’s often kind of hard to define code review, since so many people do so many different things. It covers the spectrum from “occasionally glancing at a diff when a commit message sounds dubious” to those autopilot engineers I already told to leave. In the absence of any sort of authoritative definition, I’ll go with Wikipedia’s:

Code review is systematic examination (often as peer review) of computer source code intended to find and fix mistakes overlooked in the initial development phase, improving both the overall quality of software and the developers’ skills.

Why should I care?

That’s probably as good a definition as any, and I almost fell asleep before I finished reading it. Yawn. What the heck is this good for, anyways?

The big, super important reason anyone does code review is finding mistakes.  We all make mistakes, sometimes. Even those superhuman hackers who seem to write 10x as much code as everyone else where every line looks perfect. It’s really easy to stop really looking at code after having read through it a few times. Ever spent 10 minutes trying to get something to compile before noticing that you missed a semicolon? A child could look at that code and tell you what’s wrong in seconds. It’s not that you’re dumb (well, maybe), it’s probably that you stopped reading and started remembering.

I remember a paper I wrote in college. I think it was about medieval Japan. My mother offered to read through it for me, and I wasn’t about to refuse some free editing, so I sent it to her. In one sentence, I had two “the”s. That’s right. It said something like “In the the Heian court…”. I must have read that paragraph twenty times before I sent it, and I had just stopped paying attention to what was on the page. I said that even superhuman hackers make mistakes, and that means even me.

Finding mistakes is pretty obviously a good thing, I think. After several decades of using software, people have pretty low expectations for quality. You wouldn’t believe how often I tell people where I work and they start talking my ear off about how our software never crashes, and how it’s such a different experience from, well, almost everything else they have to use. It turns out that writing software is really hard. Shocking, I know.

The next big reason for code review is training. Most large software projects have some underlying architecture and design principles which can take some time to learn. One example that immediately springs to mind is the way we use signals and slots in our UI code at VMware; this sort of asynchronous design can take a while for people to get the hang of, and providing suggestions during code review is a great way to get people up to speed in the context of doing real work.

Code review also helps us improve our truck factor. This is my term for “if you were to get hit by a truck tomorrow, how many people would understand the code you’ve written?” If it’s been peer reviewed, that number will be at least one. Gas prices may be driving down SUV sales, but there’s still a lot of trucks. Be careful out there on the roads, guys.

And last but not least, it’s so we can learn from each other. I’ve lost track of the number of times I’ve seen something I didn’t understand in someone’s code and ended up learning a new trick. Everyone likes new tricks, right?

How to ruin everything

So, overall, code review seems like a pretty good thing. We can improve the quality of our software, train new employees/contributors, and learn stuff all at the same time. Now lets look at how people do it wrong.

The first extreme I’ve mentioned is where everyone gets together and pores over some code to find every problem with it. If you’re doing this and your software doesn’t have lives on the line, stop it. Perfect code is a nice ideal, but “sometimes it is not worth fixing a bug.” There’s no way that an entire codebase can be scrutinized in this detail and still live in the fast-paced world of software.

On the other extreme, the easiest way to do it wrong is just not to do it. Sure, you may get an extra few hours a week to surf pornography and make faces at the dude in the adjacent cubicle, but it’s not gonna help the quality of your software.

So, assuming that you’re somewhere in the middle, and you’re doing code reviews without interrupting your developers’ lives too much with hours of meetings, you’ve probably fallen into one of two camps: pre-commit and post-commit. There are some pros and cons for each, but to me, the choice is obvious. Here’s why.

In a post-commit environment, a developer makes some change to the code and submits it to version control. At this point, usually in an automated fashion, that change is sent out to the other developers. This almost always takes one of three forms:

  1. The commit message and a list of files is sent to a mailing list (lame).
  2. The commit message and a diff is sent to a mailing list (slightly less lame).
  3. The change is automatically made available to a tool like Review Board or Trac (okay, I guess)

At this point, those other developers can look at the code and see if there’s anything they disagree with. The big, major, 600-pound hairy smelly gorilla of a problem with this is that /they don’t/. For example, one of the people who checked out Review Board and asked about post-commit reviews wrote this:

Sure, some bugs can (and do) creep in because they are missed in larger commits where we don’t read the entire email, but I’m not worried because QA should catch those…

It’s exactly the larger commits that need the most review! Any developer writing a three line trivial bugfix patch can convince themselves fairly easily that it’s not going to completely fuck shit up (that’s a technical term). It’s a bit harder when the patch is thousands of lines and implementing complex functionality.

Also note that unless you’re working on a private branch (which is fairly rare), committing that big patch before review has a good chance of breaking stuff. Raise your hand if you’ve done an svn up (p4 sync, git pull, whatever) and the code you pulled down didn’t build. Or crashed in a super obvious place as soon as you tried to run it. Everyone? Yeah.

The reason for this is because most people are amateurs, and those who aren’t sometimes get sloppy. Shortly before I started working at VMware, they split the main perforce depot into three separate main branches. They couldn’t componentize the code, so they decided to componentize the developers into branches based on the organization they worked on, in some meager hope of making it so people could actually get work done instead of spending all day hunting for a changeset that wasn’t broken.

That’s right, with the most amazing group of hackers I’ve ever had the pleasure of working with, they couldn’t keep *main* building cleanly once there were more than a few hundred people working on the code. I’ve heard similar stories from many other engineering organizations that I deeply respect. If they can’t do it, what makes you think that you can?

Realistically, you can’t. Noone can. That’s why we have to take steps to try to prevent it. Steps like branching. And, surprise surprise, code review. And tools. Because if the build is broken, it’s not just you and your QA contacts whose time is wasted, it’s every single engineer you’ve got.

Why would anyone want to do it wrong?

So why do people do this? I’ve seen two common reasons.

REASON NUMBER ONE! “Because that’s how people are used to doing things.” This is a powerful reason, and I won’t deny that it’s very hard to get people to change their ways. However powerful inertia is, it doesn’t mean it’s right. It took a lot of effort to get doctors to start washing their hands, but I’m pretty happy that it changed.

Number two, and this is the big stupid one. “Because we’re a fast-paced development shop, and it would slow us down.” I’ve heard this from several people. It’s particularly hilarious to me, because they talk as if slowing down is a bad thing. How many MP3 players hit the market before the iPod?  How many search engines came and went before Google? Being first to market is great, but being best to market is better.

And finally, a silver bullet

So how do we solve this big morass? Simple. Review code before it’s committed. If you’ve got branching that doesn’t hurt, commit to a branch and review it there, but before it affects anyone other than the original developers. Slow down, pay attention to what you and your teammates are doing, and you may find that you’re not quite so good at what you were doing as you thought. And once you’ve realized that, it’s hard not to want to take every possible step to improve.

Finally, before you comment, keep this in mind: I intentionally wrote this in a confrontational tone for my own amusement, and to encourage people to talk. I’ve been doing this software thing for a while, but I don’t have anywhere near all the answers. Think I’m wrong about all of this? Let me know!