Posts Tagged rant

Code Review

A lot of the initial responses we get about Review Board go something like this:

Review Board looks cool, but it’s designed for pre-commit reviews. I don’t like the idea of needing review before committing, for a number of reasons. Can my use case be supported?

The short answer to this question is “Yes, it’s supported, but our UI isn’t optimized for that case.” In particular, it requires using the post-review command line tool to create the review request instead of having a nice web UI for selecting a revision to review, and it’s almost completely undocumented.

While we’re interested in generalizing Review Board into a tool which supports a variety of workflows, I’m not writing this today to explain how to make it work. I’m writing this to explain why all of you who think this are wrong.

Before I finish my attempts to offend everyone, some people should leave the room:

If you’re working in an industry such as telecom or aerospace, and you have a very intensive code review process to ensure perfection, this probably doesn’t apply. I appreciate your hard work ensuring that the airplane I’m flying in won’t fall out of the sky, and that I can call 9-1-1 when needed.

Next, if you’re not using version control, stand up, hang your head in shame, and go set up a server. SVN, Git, Mercurial, it doesn’t really matter.  Visual SourceSafe is maybe a bad choice, though. Seriously, version control is number-freaking-one on the famous Joel Test. Once you’re up and running with that (get a bug tracker too, while you’re at it) you can start thinking about code review.

Okay, now we’re down to just those enlightened, version-control-using hackers (ahem, software engineers) working on the sort of normal software that we do. I promised to offend the rest of you. Here we go.

It’s often kind of hard to define code review, since so many people do so many different things. It covers the spectrum from “occasionally glancing at a diff when a commit message sounds dubious” to those autopilot engineers I already told to leave. In the absence of any sort of authoritative definition, I’ll go with Wikipedia’s:

Code review is systematic examination (often as peer review) of computer source code intended to find and fix mistakes overlooked in the initial development phase, improving both the overall quality of software and the developers’ skills.

Why should I care?

That’s probably as good a definition as any, and I almost fell asleep before I finished reading it. Yawn. What the heck is this good for, anyways?

The big, super important reason anyone does code review is finding mistakes.  We all make mistakes, sometimes. Even those superhuman hackers who seem to write 10x as much code as everyone else where every line looks perfect. It’s really easy to stop really looking at code after having read through it a few times. Ever spent 10 minutes trying to get something to compile before noticing that you missed a semicolon? A child could look at that code and tell you what’s wrong in seconds. It’s not that you’re dumb (well, maybe), it’s probably that you stopped reading and started remembering.

I remember a paper I wrote in college. I think it was about medieval Japan. My mother offered to read through it for me, and I wasn’t about to refuse some free editing, so I sent it to her. In one sentence, I had two “the”s. That’s right. It said something like “In the the Heian court…”. I must have read that paragraph twenty times before I sent it, and I had just stopped paying attention to what was on the page. I said that even superhuman hackers make mistakes, and that means even me.

Finding mistakes is pretty obviously a good thing, I think. After several decades of using software, people have pretty low expectations for quality. You wouldn’t believe how often I tell people where I work and they start talking my ear off about how our software never crashes, and how it’s such a different experience from, well, almost everything else they have to use. It turns out that writing software is really hard. Shocking, I know.

The next big reason for code review is training. Most large software projects have some underlying architecture and design principles which can take some time to learn. One example that immediately springs to mind is the way we use signals and slots in our UI code at VMware; this sort of asynchronous design can take a while for people to get the hang of, and providing suggestions during code review is a great way to get people up to speed in the context of doing real work.

Code review also helps us improve our truck factor. This is my term for “if you were to get hit by a truck tomorrow, how many people would understand the code you’ve written?” If it’s been peer reviewed, that number will be at least one. Gas prices may be driving down SUV sales, but there’s still a lot of trucks. Be careful out there on the roads, guys.

And last but not least, it’s so we can learn from each other. I’ve lost track of the number of times I’ve seen something I didn’t understand in someone’s code and ended up learning a new trick. Everyone likes new tricks, right?

How to ruin everything

So, overall, code review seems like a pretty good thing. We can improve the quality of our software, train new employees/contributors, and learn stuff all at the same time. Now lets look at how people do it wrong.

The first extreme I’ve mentioned is where everyone gets together and pores over some code to find every problem with it. If you’re doing this and your software doesn’t have lives on the line, stop it. Perfect code is a nice ideal, but “sometimes it is not worth fixing a bug.” There’s no way that an entire codebase can be scrutinized in this detail and still live in the fast-paced world of software.

On the other extreme, the easiest way to do it wrong is just not to do it. Sure, you may get an extra few hours a week to surf pornography and make faces at the dude in the adjacent cubicle, but it’s not gonna help the quality of your software.

So, assuming that you’re somewhere in the middle, and you’re doing code reviews without interrupting your developers’ lives too much with hours of meetings, you’ve probably fallen into one of two camps: pre-commit and post-commit. There are some pros and cons for each, but to me, the choice is obvious. Here’s why.

In a post-commit environment, a developer makes some change to the code and submits it to version control. At this point, usually in an automated fashion, that change is sent out to the other developers. This almost always takes one of three forms:

  1. The commit message and a list of files is sent to a mailing list (lame).
  2. The commit message and a diff is sent to a mailing list (slightly less lame).
  3. The change is automatically made available to a tool like Review Board or Trac (okay, I guess)

At this point, those other developers can look at the code and see if there’s anything they disagree with. The big, major, 600-pound hairy smelly gorilla of a problem with this is that /they don’t/. For example, one of the people who checked out Review Board and asked about post-commit reviews wrote this:

Sure, some bugs can (and do) creep in because they are missed in larger commits where we don’t read the entire email, but I’m not worried because QA should catch those…

It’s exactly the larger commits that need the most review! Any developer writing a three line trivial bugfix patch can convince themselves fairly easily that it’s not going to completely fuck shit up (that’s a technical term). It’s a bit harder when the patch is thousands of lines and implementing complex functionality.

Also note that unless you’re working on a private branch (which is fairly rare), committing that big patch before review has a good chance of breaking stuff. Raise your hand if you’ve done an svn up (p4 sync, git pull, whatever) and the code you pulled down didn’t build. Or crashed in a super obvious place as soon as you tried to run it. Everyone? Yeah.

The reason for this is because most people are amateurs, and those who aren’t sometimes get sloppy. Shortly before I started working at VMware, they split the main perforce depot into three separate main branches. They couldn’t componentize the code, so they decided to componentize the developers into branches based on the organization they worked on, in some meager hope of making it so people could actually get work done instead of spending all day hunting for a changeset that wasn’t broken.

That’s right, with the most amazing group of hackers I’ve ever had the pleasure of working with, they couldn’t keep *main* building cleanly once there were more than a few hundred people working on the code. I’ve heard similar stories from many other engineering organizations that I deeply respect. If they can’t do it, what makes you think that you can?

Realistically, you can’t. Noone can. That’s why we have to take steps to try to prevent it. Steps like branching. And, surprise surprise, code review. And tools. Because if the build is broken, it’s not just you and your QA contacts whose time is wasted, it’s every single engineer you’ve got.

Why would anyone want to do it wrong?

So why do people do this? I’ve seen two common reasons.

REASON NUMBER ONE! “Because that’s how people are used to doing things.” This is a powerful reason, and I won’t deny that it’s very hard to get people to change their ways. However powerful inertia is, it doesn’t mean it’s right. It took a lot of effort to get doctors to start washing their hands, but I’m pretty happy that it changed.

Number two, and this is the big stupid one. “Because we’re a fast-paced development shop, and it would slow us down.” I’ve heard this from several people. It’s particularly hilarious to me, because they talk as if slowing down is a bad thing. How many MP3 players hit the market before the iPod?  How many search engines came and went before Google? Being first to market is great, but being best to market is better.

And finally, a silver bullet

So how do we solve this big morass? Simple. Review code before it’s committed. If you’ve got branching that doesn’t hurt, commit to a branch and review it there, but before it affects anyone other than the original developers. Slow down, pay attention to what you and your teammates are doing, and you may find that you’re not quite so good at what you were doing as you thought. And once you’ve realized that, it’s hard not to want to take every possible step to improve.

Finally, before you comment, keep this in mind: I intentionally wrote this in a confrontational tone for my own amusement, and to encourage people to talk. I’ve been doing this software thing for a while, but I don’t have anywhere near all the answers. Think I’m wrong about all of this? Let me know!

Tags: , , ,

A certain number

Warning and disclaimer: vitriolic rant ahead; I am not a lawyer, just angry.

Recently I’ve been seeing a lot of a certain number on the internet. Specifically, the processing key for most of the HD-DVD movies published so far. And I’m getting really sick of it.

Most of the people copying this key did so under some delusional notion of “free speech” or “numbers not being copyrightable”. Under current US law (regardless of what you think of it), this number is part of an “circumvention device”, and as such is not under copyright, but is it also not protected speech. This number has no artistic merit or value beyond its purpose towards breaking the DRM on HD-DVD products. As far as I’m concerned, it’s just garbage that I’ve had shoved in my face constantly for the last few days.

By posting this number on your blog/web site/google notebook/tattoo-on-your-chest (I’m not making that up), you are not making a valuable contribution to the world. You are not even inconveniencing the AACS-LA or their lawyers. If you want to make a difference, you have several productive options:

  • Contact your legislative representatives and lobby for copyright/DMCA reform.
  • Put economic and social pressure on the various backers of DRM technology by writing letters and talking to people on the street. Teach people why these technologies are bad. Maybe you’ll even get them to help campaign for open formats.
  • Get out there and produce a competing technology for distribution of digital media that’s so better that everyone will want to adopt it.

I suspect these will be unpopular options, since they actually require some effort. The sort of “civil disobedience” exhibited by plastering this number all over the internet is idiotic. Picketing is effective. Boycotts are effective. Political action is sometimes effective. Whining on the internet is not. Especially when that whining is in violation of existing law. What happens when, in court, useful arguments are thrown out because the person making them was shown to have willfully disregarded existing statutes?

Please, stop polluting my internet with this garbage. Most of you doing this are bright, talented human beings. Go use that talent to create something wonderful.

Tags:

Dear Web 2.0,

There aint enough room for both of us in this town.

Screenshot.png

Love, David.

Tags: ,

Raising my ire and tempting my wrath

I just discovered that the wiki on zero (the host my blog is on) got hit pretty hard by the spammers the last few days — hard enough that I don’t have the energy to clean it up at the moment. I shudder to think what it would be like if I still allowed anonymous edits.

Anyone know of an easily set-up captcha plugin for mediawiki 1.5? Until I have the chance to crack some spammer-skulls, it’s the only thing I can think of to help with the problem.

Tags: ,

Tagging love and community ire

A couple days ago, searching for something cool to hack on, I came back to leaftag. Shortly after I started updating the deskbar handler, Luis Villa posted on his desire for it, and a release got cut. It’s pretty incomplete and buggy, and not very useful right now, but still enough to get a number of people excited.

However, it’s also enough for people to start giving us their opinions. I really don’t know why, but for some reason there’s a large group of people who really want us to use extended attributes to store the tags. Even after giving them a bunch of reasons why it’s not useful for our purposes, they keep trying to tell us how to build our bike shed.

I don’t understand why xattrs are such a big deal, and I don’t understand why everyone cares so much about how our implementation works. They’re never going to touch the code, and as long as it does what they want, why should they care what’s under the hood? I’ve set to see one comment on Christian’s blog that asked for a feature that required xattrs without first saying we should use xattrs. It really felt like all the features they listed were just attempts to justify their position that xattrs are the coolest thing since sliced cheese.

I’m starting to understand why Novell kept Xgl’s development under wraps until things worked.

Maybe I’m only starting to see something that has existed for years, but it really feels like gnome is so encumbered by endless talk, bike shedding and stop energy that very little is happening. The cool things like gimmie, deskbar, nautilus search and leaftag only happen when someone goes off in a corner, shuts themselves out from the community for a while and actually writes some code. The mailing lists and irc channels were recently aflurry with conversation about what “community development” means, and just as quickly as it came, it was gone. A huge amount of time and effort was invested into that discussion, and absolutely nothing came of it. This isn’t an unusual pattern.

I’ve heard it suggested by a couple people that gnome needs a dictator, and I really think that’s the case. At the very least, gnome needs someone who can tell people to shut up and get back to work.

And for all you xattr fanboys, code speaks a lot louder than words. Show us that xattrs provide something useful and good, and we’ll talk. Until then, shut up and get back to work.

I’m usually not one for movie quotations, but I’m considering adopting one as my mantra. “Fuck the bozos!”

Tags: , , ,

GNOME API documentation

No wonder GNOME (and linux?) doesn’t have very many ISVs. The platform really needs a new documentation system. Here’s why.

A couple weeks ago, while working on a gnome-vfs module for Chippy’s super-sexy tagging system, I noticed that gnome-python has no API documentation. No HTML, no docstrings, nothing. In order to find out a method signature, I had to look in the source code and decipher the python/C binding code.

Now, the python bindings are arguably somewhat further out in the gnome orbit. However, for an officially supported binding, this is pretty bad. The documentation for the C gnome-vfs API isn’t much better. When an ISV comes to gnome and takes a look at this, they’re going to run screaming. I know I did, and I’m willing to put up with a lot more of this crap because I know how hard it is.

How hard it is. Think about that for a moment. Setting up gtk-doc for a C module is a black art, full of copying strange build-system stuff and hours spent trying to figure out which XML files are auto-generated and which are safe to edit. If you’re lucky enough to be documenting something like gnome-python, all you have to do is write a bunch of docbook. Let me say it for the record: docbook is a total pain in the ass. I love the idea of using one source to generate HTML, pdf, etc all in one go. This is a well understood problem, and docbook is a great solution to it. Unfortunately, because docbook is such a pain to write, people aren’t doing it, and that’s the real problem.

I decided that because I was already taking the time to figure it out, I’d document it. gnome-vfs isn’t a huge API (and the python bindings are even smaller), yet it’s taking me at least two hours per object to get everything right.

The GNOME release team is requiring full API documentation for new modules, and that’s a step in the right direction from a policy standpoint. However, we need a lot of steps from a technical standpoint. gnome-vfs is pretty close to “fully documented” and yet completely impossible to use. Mono is doing something good here — anybody can edit API documentation, in a wiki-like manner; it’s not hard to find what to edit, and the commitment involved is minimal. I think if we got something similar across all of the GNOME platform (for every language), we might begin to see some real documentation. Writing API documentation isn’t hard. Writing API documentation using the current framework is a nightmare.

Tags: , , ,

paper tigers

Big flame fest going on over at lists.osdl.org regarding print dialogs, “usability” and features. Kind of entertaining reading, kind of frustrating reading.

The paper tiger here is the tendency to assume intent behind any particular result. In this particular case it’s the assumption that options provided by a PPD printer driver aren’t exposed in the gnome print dialog because they’re confusing to the user.

I’ve seen a number of people complain about my own software (xchat-gnome) using similar arguments. “Those idiots! They removed the ability to get user information!” “Those idiots! They removed the ability to detach individual channels from the main UI!” Seeing people I’ve never heard from before say things like that in public forums is painful — most of the time, xchat-gnome development is just me, and I’ve got school, work, and several other projects, in addition to a usually-failing attempt at a social life. A lot of these kinds of problems are nontrivial (such as detaching), and just because something isn’t there doesn’t mean it will never be.

I wonder if anybody really talked to the gnome-print developers about getting UI for the extra PPD options. Judging from my own experience as both a user and a developer, I’d be willing to bet that their answer would be “well, it’s hard and nobody’s figured out how to do it right” rather than “we decided that our users were too stupid to use those features.” I’m not really familiar at all with these projects, so I’m just guessing.

There’s an old adage: never ascribe to malice what can be adequately explained by incompetance. I’d like to propose an equivalent for the open-source world (and gnome in particular). Never ascribe to malice what can be adequately explained by a lack of time and manpower. I can’t think of any free software author who is willing to sacrifice truly useful functionality in exchange for a slightly lower widget count.

In the ML thread, the word “usability” came up quite a bit. For the most part, people seem to be (quite correctly) pointing out that it’s a pretty meaningless word. I got to wondering — what does usability mean to me? I’ve written a lot of GUI software, and I generally consider xchat-gnome to be pretty “usable,” but I’m not sure I ever really came up with a definition. After some thought, here it is: Software is usable when it provides the functionality a user requires to execute a certain task, in a discoverable (or at the very least least well-documented and learnable) way, with said functionality organized in logically and a minimum of extra distraction or clutter.

As always, there are people more eloquent than I who have considered this problem. “A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.”

Tags: , ,

computers suck

Looks like I have another dead hard disk, with another 230GB of data probably gone. It’s possible that I can get it off, but right now I have nowhere to put it, and buying another disk was not high on my list of priorities. It seems like hard disks are the biggest money sink in my life right now, and there is seething involved.

Update: and of course, 2 minutes after posting this, I get the acrid smell of burnt electronics filling my living room. So much for data recovery.

Tags:

guns don’t kill people, video games do

Those of you who pay attention to the media have probably heard about “hot coffee.” This is a mod for Grand Theft Auto: San Andreas which adds hot, steamy, polygon-based sex scenes to the game. The author claims that the content was already there, and Rockstar games, unsurprisingly, is claiming that they did nothing of the sort. Whoever is responsible, I’m really grumpy at the response that this is getting.

In the last few months, the usual political players have been blasting the game for causing a small handful of people to kill other people. This is hardly surprising — violent video games have been blamed for such things, without any real statistically significant proof, ever since violent video games first came out. Before, GTA was a controversial game, but it seemed like it was still accepted by a lot of people. Now, apparently, the game is completely inappropriate for children, and may be granted an “adults only” rating, which pretty much requires that retailers follow the same rules as they would for hardcore porn.

Wait, did I hear myself right? It’s OK for kids to play a game that glorifies killing police officers, but a game that glorifies sex, that’s inexcusable? According to our elected representatives and media watchdog groups, that’s the case.

Never mind that with 10 seconds and an internet connection, teenagers can get access to significantly more explicit pornography featuring actual humans. Never mind that getting access to these scenes requires downloading the mod. Ban GTA! Ban Rockstar Games!

Tags: ,

If anyone needs me, I’ll be in the angry dome

Howard Dean has kicked up a few anthills recently, and the response to these things always amuses me. However, after reading a couple of opinion pieces linked from google news, I’m pretty cranky.

I keep seeing democrats (at least, I’m assuming from their point of view that they’re democrats) talking about how the democratic party is “letting themselves be defined by the republican party” and “this needs to stop”. I’ll agree that the democratic platform has seemed a little weak on the ideals recently, but I can’t fathom how someone can write an opinion piece about how people need to state their opinions without actually stating any opinions. I don’t know if it’s out of some weird sense of political correctness, but these people aren’t helping their own cause at all.

Here. I’ll demonstrate: I’m in favor of migrant workers having access to the same quality of health care as the CEO of a multi-billion-dollar company. I don’t think the government should be allowed to know what books I’ve checked out from the library or what web sites I visit, especially if it can (and in the case of current law, must) be without notifying me. See how easy that was? Convincing people that these are good ideals is just as easy. Showing that the current republican government isn’t doing any of this, and in many cases is doing the exact opposite, is easy too.

“We need to stop letting ourselves be defined by others” is about as useful as saying nothing at all. In the same way that pity can be harmful, jumping on this bandwagon distracts everyone from the real issues at hand, and that hurts. Treating the liberal platforms as victims hurts, especially when it’s coming from within.

I imagine it would be pretty easy to get most people fired up about things like the new patriot act discussions. I can’t believe I haven’t heard someone call the bluff on this. The idea that “we haven’t been abusing these powers” is a valid excuse for giving up fundamental freedoms is preposterous to me. There will always be abuses, and by legally giving the government extreme powers, we’re giving up our ability to fix things when they go wrong. Apparently the “sneak and peek” search powers have been exercised in “only about 1%” of cases, which turns out to be about 160 searches. Doesn’t this number alarm anybody other than me?

At the time of the civil war, people weren’t saying “we have to stop letting slave-owners define our views”, they were saying “slavery is wrong.” The world is fraught with social problems, and when people start actually saying something about it instead of beating around the bush, maybe we can start to solve some of them.

Tags: ,