A while ago I ran a technical screening of a developer for one of our biggest clients. This developer was a huge fan of unit testing. Whatever software puzzle I submitted could be solved with a unit test.
Here is a sample of the interview:
- Say I have a C++ program that crashes because it exhausts its virtual memory space. What do you do to solve the problem?
- Well... Err... I'd write a unit test... And then... Err... Problem solved!
Let's sum this up with a chart:

[Figure: unit test chart]
Needless to say, the person wasn't hired. What I realized then is that unit testing is not only often misunderstood but also overrated. I don't mean that we ignore unit testing at Bureau 14; we just use it with some degree of moderation...
The obvious unit test
Let's say we want to implement a string library in C++ for some reason. I'm talking something very simple with interfaces similar to the STL string. We're going to call it xt_string.
Test-driven development is about writing tests before writing code. If you ask me, when you write the tests is not that important. I'd even say that writing tests before writing code can make you believe you don't need to write specifications or documentation (agile or not, write specifications, thanks).
Nevertheless, we're going to write obvious tests that I often call "space continuum integrity tests". Basically when these tests fail, something is terribly wrong either with your code or the physical laws of the universe. Generally, it's a problem within the code.
    xt_string st1;
    BOOST_CHECK(st1.empty());
    xt_string st2("test");
    BOOST_CHECK_EQUAL(st2, "test");
The bad unit test
Because we care about performance, our xt_string uses custom allocated buffers aligned on cache lines. And lucky us, we have a function to validate that! Quick, to the unit tester!
    xt_string st("oh my");
    BOOST_CHECK(is_cache_aligned(st.buffer()));
The buffer method returns the underlying buffer, as you probably inferred.
This test is horribly bad. Horribly. Why? Well because it's in "The bad unit test" section for instance.
It seems clever at first, but you're shooting yourself in the foot with a bullet that travels into the future. You pull the trigger now and your foot explodes one year later.
We have a simple policy here: each unit test must be ignorant of the internals. In other words, we only do black box unit tests.
Here is a non-exhaustive list of reasons:
- The most obvious: you're making internal rework of the class twice as expensive. First you change the class, then the unit test.
- Unit testing is all about enabling you to modify your code and get some early validation. If you need to change the unit test when you change the class, you don't have that validation anymore. You're just doing the same work twice, which doubles the opportunities for errors.
- People reading the unit test might base some code on it, as unit tests are often used as an example of "how to use the object". Thanks to this test, users will start assuming things about the internals, which prevents later modification or makes it really risky.
In short: a good unit test validates features and their implications without assuming anything about the implementation.
The good unit test
So that's all there is to a good unit test? Actually, there's more to it.
Let's have a look at this:
    xt_string st1("yes");
    xt_string st2("no");
    BOOST_CHECK_EQUAL(st1, "yes");
    BOOST_CHECK_EQUAL(st2, "no");
    st1 = st2;
    BOOST_CHECK_EQUAL(st1, st2);
    xt_string st3("maybe");
    BOOST_CHECK_EQUAL(st3, "maybe");
    st2 = st3;
    BOOST_CHECK_EQUAL(st2, st3);
    // st1 must still hold "no": reassigning st2 must not affect it
    BOOST_CHECK(st1 != st2);
    BOOST_CHECK(st1 != st3);
Looks pretty much like an obvious test of assignment and equality, doesn't it? Except that it's more than that. It also tests the underlying memory management. What's good about this test is that if you start sharing string buffers at some point in the future, it will make sure that copy-on-write works.
Of course this example is far from complete and much more should be written to have some reasonable degree of validation.
Whatever implementation of strings you choose, the above test must remain true. You can rewrite the string class from scratch and you won't have to change the test. It's obvious, easy to understand and detects side effects.
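To illustrate the point, here is a minimal sketch of what a copy-on-write rewrite might look like. Everything in it is an assumption made for illustration: cow_string is a hypothetical stand-in for xt_string, the append method is assumed to exist because the interface is said to resemble the STL string, and the sharing is done with std::shared_ptr in single-threaded code only. The check at the bottom touches nothing but the public interface, so it would stay valid if the buffer handling changed again.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for xt_string: a tiny copy-on-write string.
// Copies share one buffer until one of them is modified.
class cow_string {
public:
    cow_string() : data_(std::make_shared<std::string>()) {}
    cow_string(const char* s) : data_(std::make_shared<std::string>(s)) {}

    bool empty() const { return data_->empty(); }

    // Mutating operation: detach from the shared buffer before writing.
    void append(const char* s) {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
        data_->append(s);
    }

    friend bool operator==(const cow_string& a, const cow_string& b) {
        return *a.data_ == *b.data_;
    }
    friend bool operator!=(const cow_string& a, const cow_string& b) {
        return !(a == b);
    }

private:
    std::shared_ptr<std::string> data_;
};

// Black-box check: it never looks at data_, only at observable behavior.
bool copies_are_independent() {
    cow_string st1("yes");
    cow_string st2("no");
    st1 = st2;                 // both now share one buffer internally
    if (st1 != st2) return false;
    st1.append("pe");          // the write must not leak into st2
    return st1 == cow_string("nope") && st2 == cow_string("no");
}
```

Calling copies_are_independent() from whatever test framework you use is enough; if the detach step in append were forgotten, st2 would change too and the check would fail.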
A good unit test must have a certain degree of clairvoyance: the ability to detect future issues is its hallmark.
Think about what you might do with the tested code in the forthcoming years and you'll write good unit tests.
More about writing good unit tests
The real benefits of unit testing come when you validate the interactions of your different structures and functions.
Ideally, your unit tests should reflect what the users are going to do with your program. Don't try to have 100% coverage immediately (or ever, actually). Instead, aim for all typical usage scenarios. That means that after the build, when the tests are run, you know your program won't crash immediately or spit out blatantly wrong results. This doesn't replace integration testing and regression testing, but it gives quick feedback about what you're doing.
The second good reflex is that whenever a bug is found and reproduced, you write a unit test that exhibits the problem and then fix it. In doing so you reduce the probability of getting the same bug twice to almost zero.
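As a sketch of that reflex, imagine a bug report saying that assigning a string to itself corrupts it. The bug, the tiny_string class (a hypothetical stand-in for xt_string with a hand-managed buffer) and the equals method are all invented for this example. The regression check is written against the public behavior only, and the assignment operator shows one possible fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for xt_string with a hand-managed buffer.
class tiny_string {
public:
    tiny_string(const char* s) : buf_(duplicate(s)) {}
    tiny_string(const tiny_string& other) : buf_(duplicate(other.buf_)) {}
    ~tiny_string() { delete[] buf_; }

    tiny_string& operator=(const tiny_string& other) {
        // The imagined bug: the old code released buf_ before copying,
        // so self-assignment read freed memory. Copy first, then release.
        char* fresh = duplicate(other.buf_);
        delete[] buf_;
        buf_ = fresh;
        return *this;
    }

    bool equals(const char* s) const { return std::strcmp(buf_, s) == 0; }

private:
    // Allocates a new buffer holding a copy of s, terminator included.
    static char* duplicate(const char* s) {
        const std::size_t n = std::strlen(s) + 1;
        char* p = new char[n];
        std::memcpy(p, s, n);
        return p;
    }

    char* buf_;
};

// Regression check for the imagined bug report.
bool regression_self_assignment_keeps_content() {
    tiny_string s("oh my");
    tiny_string& alias = s;   // alias avoids self-assignment warnings
    s = alias;
    return s.equals("oh my");
}
```

The check stays in the suite after the fix, so if someone later reorders the delete and the copy, the build tells them right away.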
Few more words
Passing unit tests doesn't mean your program works. Never overestimate the reliability and coverage of unit testing.
Most of all, never forget that unit testing is here to increase software quality and save time. Never put your team in a position where writing tests takes too much time away from designing new features and fixing bugs.
Your customers won't like it.
Someone who claims there is no difference in outcome between test-first (TDD) and test-after has only a novice level understanding of TDD.
A bold statement, care to elaborate?
Not really, but everyone who does test-after says coverage tops out at around 70-75%, whereas the TDD crowd will tell you they get 90+%. If you think about it you can figure out some of the reasons why. There is a design component and at least a few human components. And there is a huge impact from getting that last 20-25% (usually it's against some of the code that the test-after people find "too hard to test" or that they are "too good to have to worry about testing").
An interesting point of view.
I see two points, the first is about code coverage. Let me tell you one thing, quite frankly, code coverage doesn't tell you much.
Let's consider the following code:
    int f(int x)
    {
        int a = x;
        if (x % 2)
        {
            a <<= 1;
        }
        return a;
    }
You can get 100% coverage with the inputs 1 and 0. Too bad that it doesn't achieve anything: in this case what we care about is arithmetic overflow, and neither input comes anywhere near it.
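Here is a rough sketch of that gap. The function f is repeated so the snippet stands alone; doubling_fits_in_int and the two check functions are hypothetical helpers added for illustration, and the sketch assumes long long is wider than int, as on mainstream platforms. It deliberately never calls f with the overflowing input, since the result of that shift depends on the language version.

```cpp
#include <cassert>
#include <climits>

// Same function as above, repeated so the sketch is self-contained.
int f(int x)
{
    int a = x;
    if (x % 2)
    {
        a <<= 1;
    }
    return a;
}

// Hypothetical helper: does doubling x still fit in an int?
// Computed in long long, assumed here to be wider than int.
bool doubling_fits_in_int(int x)
{
    const long long wide = static_cast<long long>(x) * 2;
    return wide >= INT_MIN && wide <= INT_MAX;
}

// These two calls execute every line and both branches of f.
bool full_coverage_inputs_pass()
{
    return f(0) == 0 && f(1) == 2;
}

// INT_MAX is odd, so f would take the shifting branch with it,
// yet the doubled value cannot be represented in an int.
bool coverage_missed_the_overflow()
{
    return (INT_MAX % 2 != 0) && !doubling_fits_in_int(INT_MAX);
}
```

Both checks hold at the same time: the coverage report is green, and the input that actually matters was never part of the test.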
For multithreaded code, I'm not sure what code coverage even means, because the order in which things happen is nondeterministic, and that order may or may not be a source of bugs.
The second point is about writing tests before writing the unit. I understand the rationale behind this and it makes sense.
However, I've done both test-before and test-after, and haven't witnessed any difference. When I wrote the tests before, I always wanted to add more afterwards, because as I designed my function I figured out potential pitfalls and came up with ideas to trigger them.
There is a kind of chicken and egg problem here.
Unless you have a very heavyweight development process that requires extremely precise specifications of every unit before it gets written, the tests you write beforehand won't be that good.
I understand that some people need to do this. We don't.
TDD does have a positive influence on quality when fixing bugs. Writing the test that exhibits the bug before doing anything else is clearly a very good thing to do. I've personally experienced this.
However, when implementing new features, TDD increases costs more than quality. There is a right amount of unit testing, and it's well below 100% code coverage. You're not interested in testing everything, as features of your unit will get added, removed or changed. You're interested in testing the core.
My personal experience, and yours may vary, is that inductive design and reviews (by yourself or a peer) have the biggest influence on quality when writing new code.
Actually, I think that tracing through your own code with a debugger is the best thing you can do to improve quality. You won't need to cover 100% to see your errors as you step through.
Inductive design is just the way I think, and it fits C++ programming very well. It may not work that well for everybody.
Your policy reads better as: "each unit test must be ignorant of the internals"
:)
Good points