Comparing fine-grained source code changes and code churn for bug prediction

  • Authors:
  • Emanuel Giger;Martin Pinzger;Harald C. Gall

  • Affiliations:
  • University of Zurich, Switzerland, Switzerland;Delft University of Technology, Netherlands, Netherlands;University of Zurich, Switzerland, Switzerland

  • Venue:
  • Proceedings of the 8th Working Conference on Mining Software Repositories
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

A significant amount of research effort has been dedicated to learning prediction models that allow project managers to efficiently allocate resources to those parts of a software system that most likely are bug-prone and therefore critical. Prominent measures for building bug prediction models are product measures, e.g., complexity or process measures, such as code churn. Code churn in terms of lines modified (LM) and past changes turned out to be significant indicators of bugs. However, these measures are rather imprecise and do not reflect all the detailed changes of particular source code entities during maintenance activities. In this paper, we explore the advantage of using fine-grained source code changes (SCC) for bug prediction. SCC captures the exact code changes and their semantics down to statement level. We present a series of experiments using different machine learning algorithms with a dataset from the Eclipse platform to empirically evaluate the performance of SCC and LM. The results show that SCC outperforms LM for learning bug prediction models.