It's not a bug, it's a feature: how misclassification impacts bug prediction

  • Authors:
  • Kim Herzig;Sascha Just;Andreas Zeller

  • Affiliations:
  • Saarland University, Germany;Saarland University, Germany;Saarland University, Germany

  • Venue:
  • Proceedings of the 2013 International Conference on Software Engineering
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

In a manual examination of more than 7,000 issue reports from the bug databases of five open-source projects, we found 33.8% of all bug reports to be misclassified---that is, rather than referring to a code fix, they resulted in a new feature, an update to documentation, or an internal refactoring. This misclassification introduces bias in bug prediction models, confusing bugs and features: On average, 39% of files marked as defective actually never had a bug. We discuss the impact of this misclassification on earlier studies and recommend manual data validation for future studies.