Unit Testing

Have you ever written code, started writing tests to make sure the code works correctly, and then become frustrated with how tedious testing can be? If so, this tutorial is for you.

It's easy to overlook the importance of testing, but it plays a key role at several points in the software development lifecycle, during both product development and maintenance. In the real world, many entry-level jobs at big software companies are in test, because testing is a chance to let new hires work in any area of the company... every part of the product needs more testing!

In this tutorial, we'll start with some broken code, and we'll write some tests for it. Then we'll start using JUnit to automate the process. We will also introduce (informally) some elements of test-driven design.

When writing tests, a good philosophy is that all code is buggy, and that the tester's job is to find the bugs. This assumption isn't always correct, since some software is provably bug-free (for example, the seL4 kernel). But it's a good philosophy to adopt when testing.

Let's begin with some bad code. This code is bad for two reasons. First, it implements an unordered array-based collection in Java without offering any advantage over ArrayList. Second, the implementation isn't correct: there are bugs, and we'll use testing to find them.
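Here's a sketch of what that might look like (the name BadCollection, and the exact API, are my own inventions for this tutorial):

    // A deliberately buggy, array-backed collection of ints.
    public class BadCollection {
        private int[] data = new int[16];        // bug: capacity is fixed at 16
        private boolean[] used = new boolean[16];
        private int next = 0;                    // next fresh slot; never decreases

        // Bug: when the fresh slots run out, the value is silently dropped,
        // even if earlier slots have been freed by remove().
        public void insert(int value) {
            if (next < data.length) {
                data[next] = value;
                used[next] = true;
                next++;
            }
        }

        // Marks a slot as deleted; the space is never reclaimed or reused.
        // An illegal index throws ArrayIndexOutOfBoundsException.
        public int remove(int index) {
            if (index < 0 || index >= next || !used[index])
                throw new ArrayIndexOutOfBoundsException(index);
            used[index] = false;
            return data[index];
        }

        public int size() {
            int n = 0;
            for (int i = 0; i < next; i++)
                if (used[i])
                    n++;
            return n;
        }
    }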

This is obviously bad code. It can't reclaim space after something has been deleted. It can't handle more than 16 elements in the collection. And it silently fails sometimes.

Unfortunately, when we have to write code to test if something is correct, we tend to do functionality testing. For example, you might write code like this:
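A sketch, using the BadCollection class above (Driver is again a made-up name):

    // Eyeball-style functionality testing: run it and read the output.
    public class Driver {
        public static void main(String[] args) {
            BadCollection c = new BadCollection();
            c.insert(10);
            c.insert(20);
            c.insert(30);
            System.out.println("size = " + c.size());      // should print 3
            System.out.println("removed " + c.remove(1));  // should print 20
            System.out.println("size = " + c.size());      // should print 2
        }
    }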

This seems good, right? You can compile both files, run the driver, and the program seems to be doing exactly what it should. But it's tedious to write all the tests it takes to really know that the code is correct. Let's try doing some tests by hand.

Let's start by trying to test if the size of the array is correct after lots of insertions:
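Still by hand, with a plain old main:

    public class ManualTest {
        public static void main(String[] args) {
            BadCollection c = new BadCollection();
            for (int i = 0; i < 100; i++)
                c.insert(i);
            if (c.size() != 100)
                System.err.println("ERROR: expected size 100, got " + c.size());
        }
    }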

This code prints an error, which is good. But that was tedious, and it can get even more tedious. Consider what happens when we try to do an illegal removal:
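Something like this, appended to our manual tests:

    BadCollection c = new BadCollection();
    c.remove(0); // nothing to remove... the exception kills the program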

The program crashes due to an exception. That's good, but all that throwing and catching is annoying. And it's a bit unrealistic to generate output for every possible test. We really want to just (a) trust that all of our tests run, and (b) get a succinct report of what went wrong.

To get started with JUnit, you're going to need two jars: junit-4.12.jar and hamcrest-core-1.3.jar. This would be a great time to set up a reasonable development environment. I created a folder called src, and put my java files into it. I then created a folder called lib, and put the two jar files into it. Finally, I made this build.xml file:
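Something along these lines should work (the target and property names are my own choices):

    <project name="tutorial" default="compile">
        <property name="src" value="src"/>
        <property name="lib" value="lib"/>
        <property name="bin" value="bin"/>

        <path id="cp">
            <fileset dir="${lib}" includes="*.jar"/>
            <pathelement location="${bin}"/>
        </path>

        <target name="compile">
            <mkdir dir="${bin}"/>
            <javac srcdir="${src}" destdir="${bin}" classpathref="cp"
                   includeantruntime="false"/>
        </target>

        <target name="clean">
            <delete dir="${bin}"/>
        </target>
    </project>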

So far, that didn't do anything new or impressive. Let's make a JUnit test:
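Here's about the smallest test we can write. I'll call the class CollectionTest... the "Test" suffix will matter in a moment, when our build script goes looking for test classes:

    import org.junit.Test;

    public class CollectionTest {
        // The simplest possible test: construct a collection. For now,
        // the test passes as long as no exception is thrown.
        @Test
        public void testConstruct() {
            BadCollection c = new BadCollection();
        }
    }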

If you re-run ant, this should build too. Now let's add a target to our build.xml file, so that we can test our code:
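A minimal sketch, using Ant's built-in junit task and the classpath we defined earlier:

    <target name="test" depends="compile">
        <junit printsummary="on">
            <classpath refid="cp"/>
            <batchtest>
                <fileset dir="${src}" includes="**/*Test.java"/>
            </batchtest>
        </junit>
    </target>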

Running "ant test" is all it should take to get a summary of how our program performed.

Now it's time to start writing useful tests. There are two things to note. First, every method annotated with @Test will run as part of our tests. Second, we can use static imports to gain easy access to JUnit's assertion features.

To make our tutorial easier, let's use a static import to get all of the assertion functionality into our test code:

import static org.junit.Assert.*;

And now let's use the simplest assertion:
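Here's a stronger version of our size test, as a method added to CollectionTest:

    @Test
    public void testSizeAfterManyInserts() {
        BadCollection c = new BadCollection();
        for (int i = 0; i < 100; i++)
            c.insert(i);
        assertEquals(100, c.size()); // fails: our collection tops out at 16
    }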

Note that when its operands are objects, the assertEquals method compares them using their equals() method.

When we run our tests, we'll see another failure. There are lots of different assertions that can be useful. Let's explore a few more. First, let's see what happens when our code unexpectedly breaks:
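For example, a test that performs an illegal removal without catching anything:

    @Test
    public void testIllegalRemove() {
        BadCollection c = new BadCollection();
        c.remove(0); // throws... JUnit reports an error, not a failure
    }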

When we run this test, the report indicates that a test produced an error. On the bright side, the suite of tests did not crash. But this error was a good thing... it's what we expected. Let's transform the error into a test that passes by telling JUnit that we expected an error. There are two ways we can do this. The second of them is much cleaner:
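Both sketches assume that an illegal removal throws ArrayIndexOutOfBoundsException, as our BadCollection does:

    // Way #1: catch the exception ourselves, and fail() if it never arrives.
    @Test
    public void testIllegalRemoveManually() {
        BadCollection c = new BadCollection();
        try {
            c.remove(0);
            fail("remove(0) on an empty collection should throw");
        } catch (ArrayIndexOutOfBoundsException e) {
            // expected... the test passes
        }
    }

    // Way #2 (much cleaner): declare the expected exception in the annotation.
    @Test(expected = ArrayIndexOutOfBoundsException.class)
    public void testIllegalRemoveDeclared() {
        BadCollection c = new BadCollection();
        c.remove(0);
    }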

Note that in both cases, we ought to be explicit about which exceptions we expect: a null pointer exception should still count as an error. The second approach makes this easier... since we tell JUnit exactly which exception we expect, we're never tempted to catch all exceptions in one catch block.

Briefly, here are a few more assertions in JUnit. We can assert the equality of arrays, we can assert that booleans are true or false, we can assert that things are or are not null, and we can test sameness (i.e., not just equal, but actually pointing to the same objects). Let's try them all out. Here are a few array tests:
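(assertArrayEquals checks that the two arrays have the same length, and then compares them element by element:)

    @Test
    public void testArrayEquality() {
        int[] expected = {1, 2, 3};
        int[] actual = {1, 2, 3};
        assertArrayEquals(expected, actual); // passes: same contents

        String[] a = {"x", "y"};
        String[] b = {"x", "z"};
        assertArrayEquals(a, b); // fails: the contents differ at index 1
    }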

Here's a rather primitive example of a boolean test. First, let's extend our collection with a method for testing emptiness:
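A one-liner, added to BadCollection:

    public boolean isEmpty() {
        return size() == 0;
    }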

Now we can perform assertions directly on the return values of the empty method:
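For example:

    @Test
    public void testEmptiness() {
        BadCollection c = new BadCollection();
        assertTrue(c.isEmpty());
        c.insert(7);
        assertFalse(c.isEmpty());
    }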

Likewise, we can make assertions about whether a value is or is not null.
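For example:

    @Test
    public void testNullness() {
        String s = null;
        assertNull(s);
        s = "hello";
        assertNotNull(s);
    }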

Finally, let's look at the idea of sameness. Here's a rather trivial example:
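(Using strings, where equals() compares contents but sameness compares references:)

    @Test
    public void testSameness() {
        String a = new String("hello");
        String b = new String("hello");
        assertEquals(a, b); // passes: the two strings are equal
        assertSame(a, a);   // passes: an object is the same as itself
        assertSame(a, b);   // fails: equal, but two distinct objects
    }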

Here, we can see that equality isn't enough... the parameters must refer to the same object, or the test fails.

In our previous examples, we ran ant test to create a quick report of how many tests succeeded and failed. But we didn't get any information about which tests failed, or why. JUnit can produce this information for us, quite easily. Better yet, we can do it all via a few changes to our build script:
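Here's a sketch of the revised target:

    <target name="test" depends="compile">
        <junit printsummary="on">
            <classpath refid="cp"/>
            <formatter type="plain"/>
            <batchtest todir="${bin}">
                <fileset dir="${src}" includes="**/*Test.java"/>
            </batchtest>
        </junit>
    </target>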

We specify a "formatter", which describes the output format (you might want to check out 'xml'), and then we add a todir attribute that indicates where the output file ought to go. The resulting report is quite verbose and helpful.

Note that our tests did not run in the order we wrote them, but they did all run, and whenever a test failed or produced an error, we received lots of information about why. From here, it would be straightforward to diagnose the errors and fix the code.

One challenge faced when unit testing is that you want to run your tests in a controlled and local environment... not on production systems. But if you're on a local system, running your tests in any order, then how do you simulate the real world? Similarly, what if you want to test the interaction between a completed component and an incomplete component?

The most common approach, as far as I can tell, is to refactor your code so that testing is easier. This can be a pain in the neck, though, especially if it means creating holes in your API.

As a concrete example, suppose we had a function that is supposed to take a file, and produce an array of strings:
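Here's a sketch (the class name CsvLoader and its content field are my own... the field will matter shortly):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;

    public class CsvLoader {
        protected String[] content;

        // Read a CSV file: split each line on commas, trim each field,
        // and flatten everything into the 'content' array.
        public void load(String filename) throws IOException {
            ArrayList<String> fields = new ArrayList<String>();
            BufferedReader in = new BufferedReader(new FileReader(filename));
            String line;
            while ((line = in.readLine()) != null)
                for (String field : line.split(","))
                    fields.add(field.trim());
            in.close();
            content = fields.toArray(new String[0]);
        }
    }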

(Note: you should be able to make an appropriate target for this in your build.xml). Given a proper "data.csv", everything should work fine. But let's refactor a bit, to make it easier to test. Our refactor is going to be really simple. Let's separate the use of files from the rest of the interface:
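In this sketch, the parsing moves into an overload that accepts any InputStream, and the file-based version becomes a thin wrapper around it (this also needs imports for InputStream, FileInputStream, and InputStreamReader):

    public void load(String filename) throws IOException {
        InputStream is = new FileInputStream(filename);
        try {
            load(is);
        } finally {
            is.close();
        }
    }

    public void load(InputStream is) throws IOException {
        ArrayList<String> fields = new ArrayList<String>();
        BufferedReader in = new BufferedReader(new InputStreamReader(is));
        String line;
        while ((line = in.readLine()) != null)
            for (String field : line.split(","))
                fields.add(field.trim());
        content = fields.toArray(new String[0]);
    }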

Now that we've made that change, we can test the unit without needing any files. We should, of course, test the file-handling code separately, but we won't worry about that for now. Note that this change also makes the code more robust: the parsing logic now works with any input source, not just files.

Here's the test we'll run:
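Note the byte array standing in for a file:

    import static org.junit.Assert.*;

    import java.io.ByteArrayInputStream;
    import org.junit.Test;

    public class CsvLoaderTest {
        @Test
        public void testParsing() throws Exception {
            CsvLoader loader = new CsvLoader();
            String csv = "a, b,c\nd , e\nf";
            loader.load(new ByteArrayInputStream(csv.getBytes()));
            // Direct access to the protected 'content' field works because
            // this test lives in the same package as CsvLoader.
            assertArrayEquals(new String[]{"a", "b", "c", "d", "e", "f"},
                              loader.content);
        }
    }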

You'll notice that I tried to play around a little bit with the use of newlines. I should write much more robust tests. For example, what if a line starts with a comma? What if it doesn't end with a comma? What about content that includes spaces? Tabs? Does the trim function always do what I expect?

Another interesting issue is that I put the test code in the same package as the code being tested. That enabled me to have access to the protected content field, without having to worry about inheritance.

The good thing about this code is that we can do all of those tests without needing multiple files. We can just provide different byte arrays.

There are two ways in which testing can impact the way that you design and develop code. The first is that there are development patterns that can make your code harder to test. Two issues stand out: (a) code that depends directly on external resources (files, databases, the network), like our file-reading example above, and (b) components that are so tightly coupled that they cannot be exercised in isolation.

It's not always easy to write code that avoids these problems. But remember: the most common place for subtle bugs is in the code that connects components. It's important to invest time in these areas.

The other important issue is that a test-first development strategy is often a good idea. Knowing the tests is a good way to be sure that you've specified a component well. Note that you can't design tests until you've designed the thing being tested, so if the first code you write is test code, then you've guaranteed that you did (at least some) design before writing any code. For many people, just that amount of discipline is a good start.

When you design tests, you're forced to think about the corner cases in the interface between components. Is the caller or callee responsible for checking for null? Who validates formats? How many overloads of a method are appropriate? These questions are much easier when you think about them early.

Lastly, test-driven design helps you to be productive. If you started a project by designing 100 test cases, then any time you sit down to work on the project, you can pick a failing case, write the corresponding code, and know that you made progress. Contrast "I wrote 100 lines of code" with "I finished two more test cases, only 18 to go". The former is amorphous, and gives you no sense of where the project stands. The latter immediately reveals your status. When you've got just a few minutes to spend, a list of unresolved tests focuses you on bite-sized pieces of work that you can finish in the time you have.

Here are a few additional tasks to try: