I was introduced to mutation testing in my last job and I am very excited about its potential. Mutation testing evaluates how good a set of unit tests are. We used pitest and, applying it against an existing project, discovered a number of tests were not working as they should have been, despite providing code coverage. We also found a couple of minor bugs.
Mutation testing works by changing the bytecode for a piece of software then running the tests against this changed code. In theory, one of the tests providing coverage for that line ought to fail if the line changes. If this is not the case, then the code coverage is not actually asserting anything about that piece of code. A good introduction is a video by pitest’s creator, Henry Coles, Testing Like It’s 1971. (The title refers to the fact that mutation testing was invented in the 1970s but is only now achieving its potential).
I’d expected mutation testing to be painfully slow, but pitest can work through large code-bases surprisingly quickly. In smaller experiments, I found I could use pitest as part of the TDD cycle with little pain.
Working with mutation testing forces code coverage to be very high. It’s easy to exclude certain external calls, but all the other code within a project will need to be both covered and asserted. For some legacy codebases, adding such high coverage is going to be difficult. High coverage without TDD often produces brittle codebases that are hard to refactor, and adding tests retrospectively to these is expensive. Rather than using mutation testing for such codebases, it is probably more important to look initially at breaking down the code from large business logic classes (sometimes known as God classes) into smaller classes using the single responsibility principle.
But that’s another story. Whatever your situation, it’s worth looking into mutation testing, and thinking about how you can introduce it into your software build process.