Over the past week, I’ve been evaluating code coverage tools for our Java application. We have a large Java project with about 1500 production classes and 550 test classes. My goal was to find a tool that integrates well into our Ant test target and automatically produces HTML reports at the application, package, class, and line level without requiring code changes. The tool can’t be too slow. Good documentation and examples, historical reports, and an IntelliJ IDEA plugin were nice to have but not necessary. Cost is a consideration. I looked at ten free tools and one commercial tool during the evaluation. I found that some projects are dead, some still have work to do, and some have produced excellent tools that should work for any team.
JavaRanch provides a nice overview of code coverage, explaining the goals of code coverage and demonstrating a few tools. I’ll rely on that and Google to provide my introduction to the subject. I will say that most successful tools work by instrumenting the bytecode to track which code is accessed. When a line of bytecode is executed, the instrumentation records the event, and those records are later processed into the reports. The instrumented bytecode is usually created in a separate directory which needs to be placed in the classpath for the JUnit task.
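To make the instrumentation idea concrete, here is a minimal hand-written sketch of what an instrumented method conceptually does. The class `CoverageSketch`, the `hit` probe, and the line numbers are all hypothetical. Real tools inject equivalent probes into the bytecode rather than the source, and none of the tools reviewed here necessarily work exactly this way.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: probes are written by hand here,
// whereas a real coverage tool would inject them into the bytecode.
class CoverageSketch {
    // Hit counts keyed by the line number of the imaginary original source.
    static final Map<Integer, Integer> HITS = new TreeMap<>();

    // The probe an instrumenter would insert before each executable line.
    static void hit(int line) {
        HITS.merge(line, 1, Integer::sum);
    }

    // "Instrumented" version of this imaginary source:
    //   1: static int abs(int x) {
    //   2:     if (x < 0) {
    //   3:         return -x;
    //   4:     }
    //   5:     return x;
    //   6: }
    static int abs(int x) {
        hit(2);
        if (x < 0) {
            hit(3);
            return -x;
        }
        hit(5);
        return x;
    }

    // Report step: percentage of the given executable lines that were hit.
    static double lineCoverage(int... executableLines) {
        int covered = 0;
        for (int line : executableLines) {
            if (HITS.containsKey(line)) {
                covered++;
            }
        }
        return 100.0 * covered / executableLines.length;
    }

    public static void main(String[] args) {
        abs(7); // a "test" that only exercises the non-negative path
        System.out.println("hits: " + HITS);
        System.out.println("line coverage: " + lineCoverage(2, 3, 5) + "%");
    }
}
```

Running a "test" that only calls `abs(7)` leaves line 3 unrecorded, which is exactly the kind of gap a line-level report highlights.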
The best listing of free Java code coverage tools is at java-source.net. I considered each of the tools on that page and Clover, a commercial product. Here are my evaluations, from least suitable to most suitable.
InsECT is a dead project. Its Sourceforge activity is 0%, and it has not released anything since version 0.91 in 2003. I didn’t evaluate it.
JVDMI Code Coverage Analyser is also a dead project. Its Sourceforge activity is 26%, and it has not released anything since version 0.2 in 2002. I didn’t evaluate it.
Interesting approach, but not for us
Jester does not instrument the byte code to present a report. Instead, it finds code that hasn’t been tested and literally changes the code. It changes branching conditions, hard-coded numbers, and String literals to see which tests break. While this is an interesting idea that did really well for Robert Martin and Robert Koss’s bowling scoring example, it is not what we are looking for.
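As a rough illustration of the idea behind this kind of tool, the sketch below compares an original predicate with a "mutant" in which a hard-coded number has been changed, and checks whether a test notices. Everything here (`MutationSketch`, the strike rule, both tests) is hypothetical and written by hand. Jester itself rewrites the source and reruns the real test suite, which this does not attempt to reproduce.

```java
import java.util.function.IntPredicate;

// Hypothetical, hand-rolled illustration of a code mutation and two tests.
class MutationSketch {
    // Original logic: a roll is a strike when all ten pins fall.
    static final IntPredicate ORIGINAL = pins -> pins == 10;

    // Mutant: the hard-coded number has been changed from 10 to 11.
    static final IntPredicate MUTANT = pins -> pins == 11;

    // Weak test: only checks that three pins is not a strike.
    static boolean weakTestPasses(IntPredicate isStrike) {
        return !isStrike.test(3);
    }

    // Stronger test: also checks that ten pins is a strike.
    static boolean strongTestPasses(IntPredicate isStrike) {
        return !isStrike.test(3) && isStrike.test(10);
    }

    public static void main(String[] args) {
        // A test "catches" the mutant when it fails against the mutated code.
        System.out.println("weak test catches mutant:   " + !weakTestPasses(MUTANT));
        System.out.println("strong test catches mutant: " + !strongTestPasses(MUTANT));
    }
}
```

A mutant that survives every test points at logic the tests never really check, which is a different question from which lines were executed.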
Hansel does not instrument the byte code to present a report. Instead, it requires the developer to insert a CoverageDecorator in the suite() method of the test. This decorator will cause the test to fail if any of the code in the target class is not executed. While this approach would be very effective, it would not work for us. Since we have over 500 tests now, retrofitting the decorator would be a lot of work, and the sudden failure of a few thousand test cases would make our current tests irrelevant. This tool might be effective for a new project, but not for us.
Closer, but didn’t work for us
NoUnit is a promising project that could work well for some projects, but it didn’t quite meet our needs. The biggest problem is that it doesn’t come with an Ant task. While I did see on forums that people have written their own Ant tasks, I want that functionality to be built into the tool. Also, NoUnit reports to the method level, while most other tools go a step further to check lines within the methods.
Quilt is popular, but I can’t figure out why. Sourceforge activity is high, yet the latest version is from October 2003. Perhaps the project will improve significantly soon, but I was not impressed. The documentation was produced in Maven, so it looked good, but there was very little meat to it. In fact, I couldn’t find a single screenshot of the final report. More troublesome, I couldn’t get the tutorial to work with our application, and the example that came with the binary distribution did not work. There are enough good alternatives not to consider this tool.
jcoverage/gpl is the free version of a product that also has two different commercial versions. I was very frustrated with the site because it also didn’t show a screenshot or the cost of the non-free versions. The tutorial that came with the gpl version (the only one I tried) was minimal, but I thought it had enough to show me how to use the tool. Unfortunately, Ant couldn’t find all of their tasks despite my following their directions, so I never really got the tool to work. Since there are better free and commercial applications with much better documentation, I didn’t press the issue.
JBlanket is an academic project that seems to have a similar approach to many other tools here. The screen shots in the documentation look great. I really wanted it to work, but there was a failure somewhere between JBlanket and JDOM. I had to move on.
Quilt, JBlanket, and jcoverage didn’t work for our application and the examples given; perhaps I would have solved the problems with more stubbornness, but the poor documentation did not encourage me to think that they would be better than Emma, which I had already evaluated.
GroboUtils would be a great product for a smaller project, but it needs much better performance before we could use it. Instrumenting our bytecode took almost twenty minutes on my machine, and the documentation implies that the instrumentation should happen on every compilation. Also, I got an OutOfMemory error during report generation. On the other hand, GroboUtils has nice functionality and examples, and I’m sure it would be a great solution for a smaller project.
Emma is a great free product. It started out as an internal tool, and it has been on Sourceforge since May 2004. Its instrumentation and report generation are very fast, taking less than two minutes each where GroboUtils took over ten minutes for each. The reports are very nice, although they don’t use pretty green and red bars like some of the other tools. There are HTML, XML, and text reports, and the HTML report allows drill-down to the line. Emma uses an interesting approach to defining coverage that often results in lines being only partially covered. For example, a line with a ternary operator where only one branch was executed will show as partially executed, which is very nice. Emma is the only free tool that worked on our application and satisfied all my requirements. The examples and documentation were thorough, although I ran into a couple of errors in the basic Ant tutorial. Emma does not provide automated historical reports or an IntelliJ plugin.
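The partial-coverage case is easy to picture with a small sketch. The code below is a rough model, not Emma's actual implementation: the class `TernaryCoverageSketch` and its two boolean flags are hypothetical stand-ins for whatever per-branch bookkeeping a tool keeps for a single source line.

```java
// Hypothetical model of one source line that contains two branches.
class TernaryCoverageSketch {
    // Stand-ins for the probes a tool might keep for each branch of the line.
    static boolean thenTaken = false;
    static boolean elseTaken = false;

    // The line under test: a single statement with a ternary operator.
    static String label(int n) {
        return n >= 0 ? then("non-negative") : otherwise("negative");
    }

    // Each helper marks its branch as executed and passes the value through.
    static String then(String s) {
        thenTaken = true;
        return s;
    }

    static String otherwise(String s) {
        elseTaken = true;
        return s;
    }

    // How a report might classify the line, given which branches ran.
    static String lineStatus() {
        if (thenTaken && elseTaken) {
            return "fully covered";
        }
        if (thenTaken || elseTaken) {
            return "partially covered";
        }
        return "not covered";
    }

    public static void main(String[] args) {
        label(5); // only the non-negative branch runs
        System.out.println("after label(5):  " + lineStatus());
        label(-5); // now the other branch runs too
        System.out.println("after label(-5): " + lineStatus());
    }
}
```

A tool that only tracks whether a line was reached would call this line covered after the first call, so the partial status carries real extra information.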
Clover is the best tool of the bunch. At its price ($2,500 for the first five developers, $100 for each after that), it should be. Its instrumentation and report generation are very fast, similar to Emma. The HTML reports are even nicer than Emma’s, showing nice green and red bars to demonstrate coverage percentage in addition to the number. The HTML reports also allow drill-down to the line level, and they are designed with both frame and no-frame versions that will be familiar to anyone who knows Javadoc. There are also XML and PDF reports. Clover also provides an executive report and historical reports to show a high-level overview of coverage and how the coverage has changed over time. Also, Clover integrates completely with IntelliJ and other IDEs.
The biggest problem I had with Clover may have been an artifact of evaluation; I couldn’t evaluate both the plugin and the Ant version at the same time because Clover instruments the actual bytecode instead of a copy, and the two installations of Clover fought over which had control. I hope that this problem goes away with the full, purchased version of Clover. Either that, or I hope that I will find a fix to the problem in a forum somewhere.
Emma and Clover were the best tools. Emma is fast, free, integrates with Ant well, and produces nice reports. Clover is fast, integrates with Ant and IntelliJ well, produces even nicer reports, and also has historical reports. However, it costs $2,500 to get the full set of features. Both have good documentation.
I do not know which we will start using; that will depend on getting budgetary approval for Clover. However, Emma is good enough that we’ll be very happy with it as a free alternative.
|Tool||License||sf.net Activity (Sep 14, 2004)||Advantages||Disadvantages|
|Clover||Commercial|| ||reports, IDE integration, history reports, nice Ant||Because it instruments the main bytecode, it takes over; cost|
|Emma||Common Public License v1.0||98.0256%||Very nice reports, fast||Have to create our own historical reports|
|GroboUtils|| || ||examples and documentation – good for small application||Instrumentation and report generation are too slow for our large application|
|NoUnit||GPL||57.1157%||Nice report||No Ant task; measures method-level coverage exercised by JUnit tests only, not statement and branch coverage|
|Jester||Open License||56.5148%||Tests fragility of tests||It doesn’t really test coverage|
|Hansel||BSD License||44.0677%||Makes the tests themselves fail||Relies on changing all the suite() methods, not on instrumenting the byte code|
| ||License and Artistic License|| || ||tutorial and documentation; couldn’t get it working|
| || || || ||couldn’t get it working|
| || || || ||couldn’t get it working|
|JVDMI Code Coverage Analyser||LGPL||26.703%||None||Inactive|