This is a guest post by Cameron Laird.
Unit tests are powerful. They can provide great value or, if misused, do considerable harm.
Here are 10 often misunderstood corners of unit testing that are likely to appear in your projects, so you can see where unit tests help and where they go wrong.
1. Tests that help with discovery
One of my favorite qualities of extensive unit tests is that surprises can surface soon after a mistake. While working recently on an arithmetic algorithm, I saw a test of an error message go bad in a completely unrelated content-management part of the application. It turned out that a week earlier, someone else had manipulated a string result in place, where a deep copy would have been more appropriate.
What a relief to discover the confusion when I did! If I’d only tested the numeric results in the vicinity of my new implementation, the missing deep copy might have stayed hidden for a few weeks more, at which point it would have been considerably harder to diagnose what the real intent and remedy were.
Our unit tests are extensive and lightweight enough that it’s easy to run them many times an hour and receive a useful reading on the overall health of the application. Unit tests are no guarantee of correctness, of course, but they can go a long way toward helping prevent me from slipping into a hazard I don’t even recognize.
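The in-place-mutation bug above can be sketched in a few lines. The names here (CACHED_TAGS, tags_for_page) are invented for illustration; the point is that handing out a shared object invites exactly the kind of distant breakage a broad test suite catches early.

```python
import copy

# Invented example: a module hands out a cached result. A caller that
# mutates it in place corrupts state for everyone else in the application.
CACHED_TAGS = ["Intro", "Body", "Footer"]

def tags_for_page():
    return CACHED_TAGS                 # hands out the shared list itself

def tags_for_page_safe():
    return copy.deepcopy(CACHED_TAGS)  # each caller gets an independent copy

tags = tags_for_page_safe()
tags.append("Draft")                   # harmless: only the copy changes
assert CACHED_TAGS == ["Intro", "Body", "Footer"]
```

A unit test asserting the cached value stays pristine fails the moment anyone appends to the result of tags_for_page, no matter how far away that caller lives.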
2. Tests that aren’t executed
One of the worst failings of a unit test is not to be attempted at all. A test so expensive — in time, setup inconvenience or some related attribute — that it takes special effort to run is a test that invites neglect.
We all want automated tests. The real gain from automation is less about saving labor costs and more about consistency. Automated unit tests are more likely to be launched correctly, even after 6 p.m., or on national holidays that only apply in certain offices, or in the face of all the other complications that are trivial in isolation but add risk in real, commercial practice.
3. Tests that are ignored
Arguably worse than unexecuted unit tests are ignored ones. Too many projects have extensive test suites that can only be managed by an expert who knows all their intricacies: “We have to let timing skews pass, and also any complaints about spelling, and …” Aim for completely clean or green runs; anything else is simply too costly to keep on track.
Projects certainly need to establish priorities. Thousands of false positives can erupt at times, and a well-run project makes choices. Some of those choices need to be to keep the continuous testing working overall, and to ensure that the master or reference sources pass tests cleanly.
It’s far better to configure specific tests to be turned off explicitly, and temporarily, than to try to overlay a whole new error-prone folklore about which errors deserve attention. Expect unit tests to run to green, and keep them that way.
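Most test frameworks support exactly this kind of explicit, temporary disabling. A minimal sketch using the standard library's unittest: the skip is visible in every report, carries a reason, and is trivial to find and re-enable later, unlike folklore about which red results to wave through.

```python
# Sketch: disable a flaky test explicitly, with a recorded reason,
# rather than teaching the team which failures to ignore.
import unittest

class TimingTests(unittest.TestCase):
    @unittest.skip("clock-skew flakiness; temporarily off, tracked for re-enabling")
    def test_timestamps_are_monotonic(self):
        ...  # turned off, but visibly so in every test report

    def test_spelling_of_labels(self):
        self.assertEqual("colour".replace("ou", "o"), "color")
```

The suite still runs to green, and the skip count in the summary keeps the disabled test from quietly becoming permanent.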
4. Tests that ensure uniform style
Was I serious about thousands of errors turning up at a time? Sure. One of the ways it happens is a team automating more of its stylistic choices.
For example, rather than leaving the tabs-vs.-spaces decision to moral persuasion, emotion-laden commit reviews or a separate quality assurance team, use an automatic tool. The first run might shock you with the number of divergences it detects. It’s worth working through them all, though, because such an automation permanently settles the question.
Do you want to see comment headers for method definitions? Should all programmers use the same indentation depth? Is there any reason to retain unreferenced variables? Resolve such questions permanently with appropriate tooling to keep the stylistic quality of your source high.
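In practice you would reach for an off-the-shelf formatter or linter, but the principle fits in a few lines. Here is a sketch of a style check (the function name is invented) that settles tabs-vs.-spaces mechanically by reporting every tab in a tree of Python sources:

```python
from pathlib import Path

def find_tab_violations(root):
    """Return (filename, line number) pairs where a .py file contains a tab."""
    violations = []
    for path in sorted(Path(root).rglob("*.py")):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if "\t" in line:
                violations.append((path.name, lineno))
    return violations
```

Run as part of the suite, a check like this turns a stylistic argument into a pass/fail fact: either the list is empty or it names the exact lines to fix.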
5. Tests that have low value
Tests that aren’t executed or are ignored don’t help. Some tests aren’t worth running, though; the trouble of maintaining them is greater than the information they provide. When some of your tests fall in this category, you improve your test suite by discarding them.
6. Tests that handle errors
Does your unit test suite exercise error paths? Building those out is an easy improvement in test quality for many applications.
Teams often emphasize happy paths in testing. A deficit in explicit tests for error-handling inevitably results in surprises when errors are examined with more care; it’s too easy otherwise for end-users to see messages like, “Error errno in function” or “Your message does not include the required [placeholder]”.
Error-handling is a requirement like any other requirement. When you write tests that deliberately provoke failures, you teach your unit test suite to treat error-handling as seriously as the happy path.
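A sketch of what this looks like, with an invented parse_port function: the error paths get assertions just as explicit as the happy path, including the wording an end-user would actually see.

```python
def parse_port(text):
    """Parse a TCP port, raising ValueError with a message a user can act on."""
    try:
        port = int(text)
    except ValueError:
        raise ValueError(f"port must be a number, got {text!r}") from None
    if not 0 < port < 65536:
        raise ValueError(f"port must be between 1 and 65535, got {port}")
    return port

# Happy path:
assert parse_port("8080") == 8080

# Error path, asserted just as explicitly:
try:
    parse_port("http")
except ValueError as exc:
    assert "must be a number" in str(exc)
else:
    raise AssertionError("expected ValueError for non-numeric input")
```

Testing the message text, not just the exception type, is what keeps placeholder strings like "Error errno in function" from reaching users.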
7. Tests that are too slow
Above I wrote about my affection for unit tests that are so quick they invite frequent exercise. What if tests are just slow, though? What if modularization, optimization, advanced mocking and all your other efforts still yield tests that take hours to finish, rather than the seconds that otherwise fit your workflow?
Segment your tests into tiers. Keep the fast ones for interactive response, but create a framework that ensures more time-consuming tests are also run periodically. Don’t abandon expensive tests; just make a more suitable home for them.
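In a real project a marker system (such as a test framework's tag-based selection) does the tiering for you; the mechanism is simple enough to sketch without any framework. The names here (tier, run_tier) are invented for illustration:

```python
# Sketch: tag each test with a tier, then let the runner pick a tier.
def tier(name):
    def mark(fn):
        fn.tier = name
        return fn
    return mark

@tier("fast")
def test_sum_small():
    assert sum(range(10)) == 45

@tier("slow")
def test_sum_large():
    assert sum(range(10**6)) == 499999500000

def run_tier(tests, wanted):
    """Run only the tests in the requested tier; return the names run."""
    ran = []
    for test in tests:
        if getattr(test, "tier", "fast") == wanted:
            test()
            ran.append(test.__name__)
    return ran
```

Interactive work runs only the "fast" tier; a scheduled nightly job runs "slow" as well, so the expensive tests keep earning their keep without blocking anyone's edit-test loop.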
8. Tests that reveal deceptions
Be realistic about unit tests. High-coverage scores, for instance, are simultaneously a worthy point of pride and no guarantee.
There are plenty of ways even a test suite that measures at 100% can mask incorrect code:
- A single line such as sum = first * second might be covered, but if it’s only tested with first = 2 and second = 2, its results are likely to be deceptive. Confusion of addition and multiplication might look like an obscure and unlikely error; the total of obscure errors is large, though.
- Tests for all values of two different variables don’t necessarily test all combinations of those two variables.
- Values might range more widely than you realize. Plenty of tests for Unicode, time zone behavior, color combinations and other specialized areas of computing turn out to cover only a subset of the possibilities, rather than the full ranges intended. Even these ranges, let alone the behaviors appropriate to them, might be specialized knowledge.
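The first bullet's deception is worth making concrete. This implementation is wrong, yet a test with first = second = 2 covers every line and passes; only a second data point separates the behaviors:

```python
def buggy_add(first, second):
    return first * second  # confusion of addition and multiplication

# 100% line coverage, green result, wrong code:
assert buggy_add(2, 2) == 4   # 2 * 2 happens to equal 2 + 2

# One more input pair exposes the bug:
assert buggy_add(2, 3) == 6   # a correct add would return 5
```

Coverage tools count which lines executed, not which behaviors were distinguished; varied inputs, not just covered lines, are what rule out this class of error.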
As valuable as unit tests are, they need help. Set yourself this assignment: If an error does make it through all your tests, and turns up first in a report from an end-user, how will your application and organization respond? Is that the response you want in such a situation? Put a system in place before that happens.
9. Tests that are correct but still not necessarily usable
An application that fulfills all its expressed requirements might still be low in usability, accessibility, or another quality no one considered when writing unit tests. This might have no easy solution; it helps, though, to be aware of the possibility.
10. Tests that predict dependency surprises
You start to code a tiny enhancement. Your first continuous testing report complains of an error in an utterly unrelated part of the application. How can that be?
The possibilities abound:
- That other part of the application might have a deeper connection to what you’re doing than you realized
- Perhaps your origin — the source before you made a change — was already corrupt somehow
- Something outside your control, and perhaps even outside your whole system, changed in an incompatible way
In all likelihood, your project is unique: Its combination of dependencies on external libraries differs from that of any other application, anywhere. Even when all the libraries are of high quality and carefully tested before each individual release, it’s possible that a symptom might turn up in your testing that has never been seen by anyone else.
When this happens to you, be glad you found it early, during programming, rather than it first appearing in a customer encounter!
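One cheap way to catch the third possibility early is a dependency "canary" test: record the library versions your code was developed against, and fail loudly when the environment drifts. A sketch using only the standard library's importlib.metadata (the function name is invented):

```python
from importlib import metadata

def version_mismatches(expected):
    """Return (name, wanted, installed) triples that differ from expectations."""
    mismatches = []
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches.append((name, wanted, installed))
    return mismatches
```

When an utterly unrelated test goes red, an empty mismatch list rules out one whole class of explanation in seconds.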
Computing is remarkably flexible, and testing computing constructs is correspondingly wide and deep. While the basic idea of unit tests is straightforward, to make the most of it demands great attention to detail and a willingness to adjust to a broad range of specific situations. Be on the lookout for these 10 circumstances so you know what to emulate and what to avoid.
Cameron Laird is an award-winning software developer and author. Cameron participates in several industry support and standards organizations, including voting membership in the Python Software Foundation. A long-time resident of the Texas Gulf Coast, Cameron’s favorite applications are for farm automation.