On guiding the augmentation of an automated test suite via mutation analysis

  • Authors:
  • Ben H. Smith; Laurie Williams

  • Affiliations:
  • Department of Computer Science, North Carolina State University, Raleigh, USA 27695-8206; Department of Computer Science, North Carolina State University, Raleigh, USA 27695-8206

  • Venue:
  • Empirical Software Engineering
  • Year:
  • 2009

Abstract

Mutation testing has traditionally been used as a defect injection technique to assess the effectiveness of a test suite, as represented by a "mutation score." Recently, mutation testing tools have become more efficient, and industrial use of mutation analysis is growing. Mutation analysis entails adding or modifying test cases until the test suite detects as many mutants as possible and the mutation score is satisfactory. The augmented test suite that results from mutation analysis may reveal latent faults, and it provides a stronger test suite for detecting errors that might be injected in the future. Software engineers often look to line and/or branch coverage tools for guidance on how to augment their test suites. As the use of mutation analysis grows, software engineers will want to know how the emerging technique compares with and/or complements coverage analysis for guiding the augmentation of an automated test suite. Software engineers can also benefit from a better understanding of efficient mutation analysis techniques. To address these needs, we conducted an empirical study of the use of mutation analysis on two open source projects. Our results indicate that a focused effort on increasing mutation score leads to a corresponding increase in line and branch coverage, up to the point where line coverage, branch coverage, and mutation score reach a maximum but still leave some types of code structures uncovered. Mutation analysis guides the creation of additional "common programmer error" tests beyond those written to increase line and branch coverage. We also found that 74% of our chosen set of mutation operators is useful, on average, for producing new tests. The remaining 26% of mutation operators did not produce new test cases because their mutants were immediately detected by the initial test suite, were indirectly detected by test suites we added to detect other mutants, or could not be detected by any test.
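
To make the notion of a mutation score concrete, the following is a minimal sketch and is not taken from the study. It assumes the common definition of the score as the fraction of mutants detected by the test suite. The function `is_adult`, the three hand-written mutants, and the test inputs are all hypothetical, and real tools generate mutants automatically rather than by hand.

```python
def is_adult(age):
    """Hypothetical function under test."""
    return age >= 18


# Hypothetical mutants: each one makes a single small change to is_adult.
def mutant_gt(age):
    return age > 18  # relational operator changed: >= becomes >


def mutant_lt(age):
    return age < 18  # relational operator changed: >= becomes <


def mutant_const(age):
    return age >= 19  # constant changed: 18 becomes 19


MUTANTS = [mutant_gt, mutant_lt, mutant_const]


def suite_passes(fn, tests):
    """Return True if fn gives the expected result for every test case."""
    return all(fn(arg) == expected for arg, expected in tests)


def mutation_score(mutants, tests):
    """Fraction of mutants detected, i.e. for which some test fails."""
    detected = sum(1 for m in mutants if not suite_passes(m, tests))
    return detected / len(mutants)


# The initial suite covers every line and both outcomes of is_adult,
# yet it detects only one of the three mutants.
initial_tests = [(30, True), (5, False)]

# Adding a boundary-value test detects the two surviving mutants.
augmented_tests = initial_tests + [(18, True)]

print(mutation_score(MUTANTS, initial_tests))
print(mutation_score(MUTANTS, augmented_tests))
```

In this toy case the initial suite already exercises every line of `is_adult`, so coverage alone would not prompt the boundary-value test; the surviving mutants do. This mirrors, in miniature, the kind of "common programmer error" test the abstract describes.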