A lot of the ini­tial responses we get about Review Board go some­thing like this:

Review Board looks cool, but it’s designed for pre-commit reviews. I don’t like the idea of need­ing review before com­mit­ting, for a num­ber of rea­sons. Can my use case be supported?

The short answer to this ques­tion is “Yes, it’s sup­ported, but our UI isn’t opti­mized for that case.” In par­tic­u­lar, it requires using the post-review com­mand line tool to cre­ate the review request instead of hav­ing a nice web UI for select­ing a revi­sion to review, and it’s almost com­pletely undocumented.

While we’re inter­ested in gen­er­al­iz­ing Review Board into a tool which sup­ports a vari­ety of work­flows, I’m not writ­ing this today to explain how to make it work. I’m writ­ing this to explain why all of you who think this are wrong.

Before I fin­ish my attempts to offend every­one, some peo­ple should leave the room:

If you’re work­ing in an indus­try such as tele­com or aero­space, and you have a very inten­sive code review process to ensure per­fec­tion, this prob­a­bly doesn’t apply. I appre­ci­ate your hard work ensur­ing that the air­plane I’m fly­ing in won’t fall out of the sky, and that I can call 9–1-1 when needed.

Next, if you’re not using ver­sion con­trol, stand up, hang your head in shame, and go set up a server. SVN, Git, Mer­cu­r­ial, it doesn’t really mat­ter.  Visual Source­Safe is maybe a bad choice, though. Seri­ously, ver­sion con­trol is number-freaking-one on the famous Joel Test. Once you’re up and run­ning with that (get a bug tracker too, while you’re at it) you can start think­ing about code review.

Okay, now we’re down to just those enlight­ened, version-control-using hack­ers (ahem, soft­ware engi­neers) work­ing on the sort of nor­mal soft­ware that we do. I promised to offend the rest of you. Here we go.

It’s often kind of hard to define code review, since so many peo­ple do so many dif­fer­ent things. It cov­ers the spec­trum from “occa­sion­ally glanc­ing at a diff when a com­mit mes­sage sounds dubi­ous” to those autopi­lot engi­neers I already told to leave. In the absence of any sort of author­i­ta­tive def­i­n­i­tion, I’ll go with Wikipedia’s:

Code review is sys­tem­atic exam­i­na­tion (often as peer review) of com­puter source code intended to find and fix mis­takes over­looked in the ini­tial devel­op­ment phase, improv­ing both the over­all qual­ity of soft­ware and the devel­op­ers’ skills.

Why should I care?

That’s prob­a­bly as good a def­i­n­i­tion as any, and I almost fell asleep before I fin­ished read­ing it. Yawn. What the heck is this good for, anyways?

The big, super impor­tant rea­son any­one does code review is find­ing mis­takes.  We all make mis­takes, some­times. Even those super­hu­man hack­ers who seem to write 10x as much code as every­one else where every line looks per­fect. It’s really easy to stop really look­ing at code after hav­ing read through it a few times. Ever spent 10 min­utes try­ing to get some­thing to com­pile before notic­ing that you missed a semi­colon? A child could look at that code and tell you what’s wrong in sec­onds. It’s not that you’re dumb (well, maybe), it’s prob­a­bly that you stopped read­ing and started remem­ber­ing.

I remem­ber a paper I wrote in col­lege. I think it was about medieval Japan. My mother offered to read through it for me, and I wasn’t about to refuse some free edit­ing, so I sent it to her. In one sen­tence, I had two “the“s. That’s right. It said some­thing like “In the the Heian court…”. I must have read that para­graph twenty times before I sent it, and I had just stopped pay­ing atten­tion to what was on the page. I said that even super­hu­man hack­ers make mis­takes, and that means even me.

Find­ing mis­takes is pretty obvi­ously a good thing, I think. After sev­eral decades of using soft­ware, peo­ple have pretty low expec­ta­tions for qual­ity. You wouldn’t believe how often I tell peo­ple where I work and they start talk­ing my ear off about how our soft­ware never crashes, and how it’s such a dif­fer­ent expe­ri­ence from, well, almost every­thing else they have to use. It turns out that writ­ing soft­ware is really hard. Shock­ing, I know.

The next big rea­son for code review is train­ing. Most large soft­ware projects have some under­ly­ing archi­tec­ture and design prin­ci­ples which can take some time to learn. One exam­ple that imme­di­ately springs to mind is the way we use sig­nals and slots in our UI code at VMware; this sort of asyn­chro­nous design can take a while for peo­ple to get the hang of, and pro­vid­ing sug­ges­tions dur­ing code review is a great way to get peo­ple up to speed in the con­text of doing real work.

Code review also helps us improve our truck fac­tor. This is my term for “if you were to get hit by a truck tomor­row, how many peo­ple would under­stand the code you’ve writ­ten?” If it’s been peer reviewed, that num­ber will be at least one. Gas prices may be dri­ving down SUV sales, but there’s still a lot of trucks. Be care­ful out there on the roads, guys.

And last but not least, it’s so we can learn from each other. I’ve lost track of the num­ber of times I’ve seen some­thing I didn’t under­stand in someone’s code and ended up learn­ing a new trick. Every­one likes new tricks, right?

How to ruin everything

So, over­all, code review seems like a pretty good thing. We can improve the qual­ity of our soft­ware, train new employees/contributors, and learn stuff all at the same time. Now lets look at how peo­ple do it wrong.

The first extreme I’ve men­tioned is where every­one gets together and pores over some code to find every prob­lem with it. If you’re doing this and your soft­ware doesn’t have lives on the line, stop it. Per­fect code is a nice ideal, but “some­times it is not worth fix­ing a bug.” There’s no way that an entire code­base can be scru­ti­nized in this detail and still live in the fast-paced world of software.

On the other extreme, the eas­i­est way to do it wrong is just not to do it. Sure, you may get an extra few hours a week to surf pornog­ra­phy and make faces at the dude in the adja­cent cubi­cle, but it’s not gonna help the qual­ity of your software.

So, assum­ing that you’re some­where in the mid­dle, and you’re doing code reviews with­out inter­rupt­ing your devel­op­ers’ lives too much with hours of meet­ings, you’ve prob­a­bly fallen into one of two camps: pre-commit and post-commit. There are some pros and cons for each, but to me, the choice is obvi­ous. Here’s why.

In a post-commit envi­ron­ment, a devel­oper makes some change to the code and sub­mits it to ver­sion con­trol. At this point, usu­ally in an auto­mated fash­ion, that change is sent out to the other devel­op­ers. This almost always takes one of three forms:

  1. The com­mit mes­sage and a list of files is sent to a mail­ing list (lame).
  2. The com­mit mes­sage and a diff is sent to a mail­ing list (slightly less lame).
  3. The change is auto­mat­i­cally made avail­able to a tool like Review Board or Trac (okay, I guess)

At this point, those other devel­op­ers can look at the code and see if there’s any­thing they dis­agree with. The big, major, 600-pound hairy smelly gorilla of a prob­lem with this is that /they don’t/. For exam­ple, one of the peo­ple who checked out Review Board and asked about post-commit reviews wrote this:

Sure, some bugs can (and do) creep in because they are missed in larger com­mits where we don’t read the entire email, but I’m not wor­ried because QA should catch those…

It’s exactly the larger com­mits that need the most review! Any devel­oper writ­ing a three line triv­ial bug­fix patch can con­vince them­selves fairly eas­ily that it’s not going to com­pletely fuck shit up (that’s a tech­ni­cal term). It’s a bit harder when the patch is thou­sands of lines and imple­ment­ing com­plex functionality.

Also note that unless you’re work­ing on a pri­vate branch (which is fairly rare), com­mit­ting that big patch before review has a good chance of break­ing stuff. Raise your hand if you’ve done an svn up (p4 sync, git pull, what­ever) and the code you pulled down didn’t build. Or crashed in a super obvi­ous place as soon as you tried to run it. Every­one? Yeah.

The rea­son for this is because most peo­ple are ama­teurs, and those who aren’t some­times get sloppy. Shortly before I started work­ing at VMware, they split the main per­force depot into three sep­a­rate main branches. They couldn’t com­po­nen­tize the code, so they decided to com­po­nen­tize the devel­op­ers into branches based on the orga­ni­za­tion they worked on, in some mea­ger hope of mak­ing it so peo­ple could actu­ally get work done instead of spend­ing all day hunt­ing for a change­set that wasn’t broken.

That’s right, with the most amaz­ing group of hack­ers I’ve ever had the plea­sure of work­ing with, they couldn’t keep *main* build­ing cleanly once there were more than a few hun­dred peo­ple work­ing on the code. I’ve heard sim­i­lar sto­ries from many other engi­neer­ing orga­ni­za­tions that I deeply respect. If they can’t do it, what makes you think that you can?

Real­is­ti­cally, you can’t. Noone can. That’s why we have to take steps to try to pre­vent it. Steps like branch­ing. And, sur­prise sur­prise, code review. And tools. Because if the build is bro­ken, it’s not just you and your QA con­tacts whose time is wasted, it’s every sin­gle engi­neer you’ve got.

Why would any­one want to do it wrong?

So why do peo­ple do this? I’ve seen two com­mon reasons.

REASON NUMBER ONE! “Because that’s how peo­ple are used to doing things.” This is a pow­er­ful rea­son, and I won’t deny that it’s very hard to get peo­ple to change their ways. How­ever pow­er­ful iner­tia is, it doesn’t mean it’s right. It took a lot of effort to get doc­tors to start wash­ing their hands, but I’m pretty happy that it changed.

Num­ber two, and this is the big stu­pid one. “Because we’re a fast-paced devel­op­ment shop, and it would slow us down.” I’ve heard this from sev­eral peo­ple. It’s par­tic­u­larly hilar­i­ous to me, because they talk as if slow­ing down is a bad thing. How many MP3 play­ers hit the mar­ket before the iPod?  How many search engines came and went before Google? Being first to mar­ket is great, but being best to mar­ket is better.

And finally, a sil­ver bullet

So how do we solve this big morass? Sim­ple. Review code before it’s com­mit­ted. If you’ve got branch­ing that doesn’t hurt, com­mit to a branch and review it there, but before it affects any­one other than the orig­i­nal devel­op­ers. Slow down, pay atten­tion to what you and your team­mates are doing, and you may find that you’re not quite so good at what you were doing as you thought. And once you’ve real­ized that, it’s hard not to want to take every pos­si­ble step to improve.

Finally, before you com­ment, keep this in mind: I inten­tion­ally wrote this in a con­fronta­tional tone for my own amuse­ment, and to encour­age peo­ple to talk. I’ve been doing this soft­ware thing for a while, but I don’t have any­where near all the answers. Think I’m wrong about all of this? Let me know!