Why I /do/ document and unit test

I’ve just read Justin’s Why I don’t document or unit test post, which explains the way he works, so here’s a counterpoint from the way I work. On the Psi project I’ve inherited the habit of not documenting and not unit testing from Justin, and there it’s fairly hard to change; at my day job I unit test and document quite a lot, so I’ve worked with both approaches. This post mimics Justin’s, as I would have written it, and will only make sense immediately after reading his article.

Why I document and unit test

I write documentation and create unit tests, because they’re great things. Software projects are, almost universally, written to address a perceived problem, and fail if they can’t address this need, so they need to be tested. Similarly, very little software is write-only, and if someone needs to read it, it needs to be understandable; it needs documentation. Not all projects satisfy these criteria, but those that do are successful and fun to work with. Maybe that’s because you have some degree of certainty that your code does what you want it to; maybe it’s because you can stand straight and say “I did what I thought was best” and not feel guilty; maybe it’s because you can show your code to people and not have to make excuses, or explain what your method ‘doSomethingWithSideEffects()’ does; maybe it’s because your unit tests caught a critical bug that you just know would have taken you days to find in 6 months without them; maybe it’s just a matter of pride in your craft as a developer. Tests are particularly gratifying - when they pass you can move on, and when they fail you save yourself a lot of heartache (and headache) in the future.

Or maybe it’s because documentation and tests add such huge value to everything else that you’re doing. I’ve worked (and continue to work) in both professional and open-source environments where code is unit tested and sufficiently documented - I suspect that it takes a brave person to notice that an undertested, underdocumented project is a well of unproductivity and to invest the manpower to turn it around, usually in the face of considerable pressure for more, more, more, now, now, now. On the other hand, I can’t imagine what type of person would deliberately go the other way. I think the biggest problem for a lot of people is legacy - it’s hard to update your practices when you’re knee-deep in code that’s certifiably untestable - but I never hear those who do as much as they’re able to complaining about the results.

Deferring documentation and testing is a misguided attempt to get the most bang for your development buck this morning by sacrificing your productivity this afternoon. By the time this afternoon comes, you find yourself wondering why you no longer have the time to be productive. It’s easy to be lazy about these things if you don’t believe in them, but if you don’t believe your code is worth testing, are you saying that you don’t believe the code is worth writing? If it’s worth writing something, surely it’s worth writing the right thing. Do you find yourself saying that you don’t want to add features now because you’re going to rewrite it later? How bad must the code quality have been for you not to trust yourself to add a new feature without breaking something? Yes, there is an overhead to writing code that can be unit tested - that overhead is the overhead of writing code that works. If you find yourself resisting testing, is it because you know that your code is broken, but you can’t bear to find out how badly? Would unit testing your code show that you wasted a lot of time that could have been saved if you’d written the tests first? If you do fall into this trap, then you’re going to have a hard time getting out of it, because you’ll be wondering how much time you’ve wasted. Ostriches aren’t the only creatures that (mythically) bury their heads in the sand.

I document my code, because I like my design

Do you find yourself writing a big paragraph of Doxygen or Javadoc comments for your method? Why? Shouldn’t your method name tell you what it does? Shouldn’t your method parameters’ names, plus their types (if such is your language), tell you how to call it? If not, why not? If your method is called “addTwoNumbers(int num1, int num2)”, you shouldn’t be documenting to say that, by the way, it also resets some internal mutable state; if you need to put that comment in, the smart money says that your code’s broken. Why bother spending the time on sensible naming and self-documenting code when you don’t even know that you need this method? Back up a minute - why are you writing this method if you don’t know that you need it yet? Yes, requirements for a project change (and we embrace those changes, as Agile developers), and if code goes out of (project) scope, we’re glad that our methods were small, single-responsibility methods, so we don’t spend time splitting out the functionality we don’t need from the functionality we do. We’re also glad that we wrote no more code for that requirement than we were sure we needed.
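
To make that concrete, here’s a minimal sketch (in Java, with an invented running-total field purely for illustration) of the kind of hidden side effect I mean, next to the more honest alternative:

    // Before: the name promises a pure calculation, but the method also
    // quietly resets internal state.
    class SneakyCalculator {
        private int runningTotal;   // invented field, for illustration only

        /** Adds two numbers. (Oh, and it also resets the running total.) */
        int addTwoNumbers(int num1, int num2) {
            runningTotal = 0;       // the hidden side effect the name never mentions
            return num1 + num2;
        }
    }

    // After: the side effect gets its own, clearly named method, and the
    // addition no longer needs a Javadoc paragraph at all.
    class HonestCalculator {
        private int runningTotal;

        void resetRunningTotal() {
            runningTotal = 0;
        }

        int addTwoNumbers(int num1, int num2) {
            return num1 + num2;
        }
    }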

It happens in software that you have to refactor stuff. That’s just the way it goes. Requirements are never right the first time, but at least we know that, and we keep checking them with the users of the system, and with the users of our methods, as appropriate. When you know that your requirements are going to change, you have no business writing code that’s going to be hard to refactor, or code that doesn’t need to be there. You take your requirements from your user, turn those into the minimal API requirements for your code, write your unit tests to assert that those minimal requirements are met, and then write the minimal code to satisfy the tests, and therefore the requirements. You really don’t want to be committing to such a heavy design up front that it’s going to be a hassle to update it. As the saying could go: code that needs documentation is worse than wrong documentation, which is worse than no documentation.
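
As a sketch of that workflow - the requirement, class and method names here are invented, and I’m assuming JUnit as the test framework - the test states the minimal API first, and the implementation does no more than the test demands:

    import static org.junit.Assert.assertEquals;

    import org.junit.Test;

    // Invented requirement: an order total is the sum of its line prices.
    // The test pins down the minimal API before any implementation exists.
    public class OrderTest {
        @Test
        public void totalIsSumOfLinePrices() {
            Order order = new Order();
            order.addLine(300);   // prices in pence, to dodge floating point
            order.addLine(450);
            assertEquals(750, order.total());
        }
    }

    // The minimal implementation that satisfies the test - and nothing more.
    class Order {
        private int total;

        void addLine(int priceInPence) {
            total += priceInPence;
        }

        int total() {
            return total;
        }
    }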

In general, if the code is such that other developers are able to tell you what your code does from the class, method and variable names, your energies are being well spent.

I test, because I care about my implementation

And I always care about my implementation - you do too, or you wouldn’t be so far down this gargantuan post. There are only three reasons that immediately come to mind why you might not care about your implementation:

  1. You hate the task that your code is written for. This is going to be a problem if you’re captured by an evil genius, ideally with a cat, and are working on an advanced intelligence, or orbital laser platform, to be used to enslave all mankind, but for most of us this doesn’t apply.

  2. You simply don’t care about the result. You’re demotivated, and if it doesn’t work then it’s someone else’s problem to solve: the testers will pick up the problems, and you’ll be on holiday when the bug reports come in. This isn’t you.

  3. You do care about your implementations, but you don’t know how to do a job you’re proud of, so you tell yourself that you don’t care about them as a self-preservation mechanism. This may be you; it’s been most of us at some point. Thankfully there are smart people who’ve written books giving advice on how to be a better developer. In fact, titles like Agile Software Development, Clean Code, Extreme Programming Explained, and The Pragmatic Programmer make this post redundant - read them and save yourself the rest of it. Books may not make you smarter, but they’re not going to make you a worse developer.

It’s possible to get tied up with design and relegate implementation to being a second-class concern. Think about this for a moment. What’s design, and what’s implementation? When class A calls a few methods on class B, what is that code - design or implementation? It may be the design for class B, but it’s the implementation of class A. Don’t get too hung up on which is which - generally the design is driven by the needs of the implementation. You know what methods class B needs once you’ve written class A’s method that makes the calls. Divorcing design and implementation is a great way to make sure your code isn’t going to do quite what it was meant to. I’ve done this, and you may have done this. Live and learn. This is just another case of ‘code what you need’. If you only code what you need, you only test what you need (and equally, if you only test what you need then you only code what you need).
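
A tiny hypothetical sketch of what I mean (the names are invented): writing class A’s implementation is precisely what pins down class B’s design.

    import java.io.PrintStream;
    import java.util.List;

    // ReportPrinter is our class A; ReportSource is our class B. Writing
    // ReportPrinter's implementation is exactly what tells us which methods
    // ReportSource needs to expose - A's implementation is B's design.
    interface ReportSource {
        String title();
        List<String> lines();
    }

    class ReportPrinter {
        void print(ReportSource source, PrintStream out) {
            out.println(source.title());          // so B needs a title()...
            for (String line : source.lines()) {  // ...and a lines()
                out.println(line);
            }
        }
    }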

Some people will start testing and find that they can’t test it all. If this is you, and you’re finding that your methods have side effects that you can’t test, then you’re one of the lucky ones: congratulations! Your unit testing has found problems with your code even before you’ve written your first line of a test. There is certainly a place for mutable objects (objects whose state can be changed through method calls) - Java code would look very different if ArrayLists were immutable - but this doesn’t mean that your objects should be mutable. In the rare cases where your object /does/ need to be mutable, make it explicit: don’t have a method that does one thing superficially but also changes state. Oh, and don’t write classes such that your objects can get into an inconsistent state, either - a common example from my past is constructing an object but requiring another ‘set’ method call before the object is usable. I kick myself when I wonder why I didn’t pass the value in the constructor. If you can’t unit test your code, you probably need to reconsider your approach. The obvious counter to this is a dependency on external state (particularly databases, or networks) - unit testers get around this with mock objects that look like the external dependency but are deterministically controllable from the tests, and you can easily fall in love with them very quickly.
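
Here’s a small sketch of both points - constructor injection instead of a follow-up ‘set’ call, and a hand-rolled mock standing in for the network. The names are invented, I’m assuming JUnit, and no mocking library is needed for something this small:

    import static org.junit.Assert.assertEquals;

    import java.util.Arrays;
    import java.util.List;

    import org.junit.Test;

    // The bit that would normally talk to a real network.
    interface UserDirectory {
        List<String> onlineUsers();
    }

    // The dependency goes in the constructor, so a Roster is usable the moment
    // it exists - no half-constructed object waiting for a 'set' call.
    class Roster {
        private final UserDirectory directory;

        Roster(UserDirectory directory) {
            this.directory = directory;
        }

        int onlineCount() {
            return directory.onlineUsers().size();
        }
    }

    // A hand-rolled mock: it looks like the external dependency, but is
    // entirely deterministic, so the test never touches a real network.
    public class RosterTest {
        @Test
        public void countsOnlineUsersFromTheDirectory() {
            UserDirectory fakeDirectory = new UserDirectory() {
                public List<String> onlineUsers() {
                    return Arrays.asList("alice", "bob");
                }
            };
            assertEquals(2, new Roster(fakeDirectory).onlineCount());
        }
    }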

In general, code that works will continue working if you don’t touch it. Psi’s a great case in point - if it didn’t consist of a majority of working code, then nobody would use it. It hasn’t traditionally had any unit tests, and that’s taken its toll on development - development cycles are long because no-one dares to refactor the code to support new features, and once the features are in, release candidate cycles are typically even longer because it’s difficult (and demoralising) to hunt down the bugs that have been introduced, especially when the bugs often cause crashes in unrelated code, due to mutable objects. Until we build up adequate tests, the cycle will continue, as people spend their energies on new features and avoid fixing bugs that are difficult to reproduce, or that are in code perceived to be risky to modify. That doesn’t mean that Psi’s buggy, or that it will become buggy - we don’t leave the Release Candidate phase until we have reasonable confidence that we’ve got all those bugs out, and we do a pretty good job of it (I hope). It’d just be nice not to need to introduce them, or to spend so much time in RC getting rid of them.

There are no exceptions (almost)

It’s easy in the early stages of a project to neglect testing, or to write code whose behaviour isn’t obvious, but this is the greatest of false economies - when your early code doesn’t work, your later code isn’t going to work either. It’s also easy to slip into the mindset that unless you can get complete coverage now, you shouldn’t test at all. If, for some reason, you can’t get as much coverage as you want, every behaviour that you do test is still another bug that can’t happen.

About Documentation

Above, I claim that all code should be self-documenting. I stand by that - it should be obvious what code does from its name and signature. That doesn’t mean that I believe code should never have a Doxygen or Javadoc comment - some people (whom I respect) hold this opinion, and I may too, some day. At the moment, though, I can still see value in a greater explanation in a comment, especially if there’s a business requirement that makes something non-obvious - you know the sort of thing: “oh yeah, all numbers must always be rounded up, 2.1 is really 3”. Why you have to do something, where it’s not obvious, is a great thing to document. If you have to document what you’re doing, then you probably need to consider why it’s not obvious from your code.
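
For example - a hypothetical sketch, with an invented method name - a comment that records the why rather than the what might look like this:

    class Billing {
        /**
         * Rounds a quantity up to a whole number of chargeable units.
         *
         * Why: the billing requirement says partial units are always charged
         * as full ones, so 2.1 units really is 3. The name and signature say
         * what this does; this comment only exists to record why.
         */
        static int chargeableUnits(double quantity) {
            return (int) Math.ceil(quantity);
        }
    }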

Conclusion

Deferring documentation or unit tests is not always about laziness; it’s about efficiency. Note my use of the word “defer” - someone else will need to understand your code eventually, and you’re going to need to do the testing, either manually or automatically, to make it work someday. You’re sacrificing your efficiency later for a tiny gain now. It’s clear to me that it’s necessary to prioritise the highest-impact tasks first, with tests to ensure that they work, and to put effort into the next features only once all the problems with the first piece of code are solved and the probability of later needing to expend considerable effort fixing it, or of being unable to modify it for fear of breaking it, is low.

Happy coding!

Here ends my version - thanks to Justin for providing the original as a talking piece, which gave me a chance to get some thoughts down on electrons that previously I hadn’t taken the time to express.