Code churn estimation using organisational and code metrics: An experimental comparison

  • Authors:
  • Siim Karus;Marlon Dumas

  • Affiliations:
  • Institute of Computer Science, University of Tartu, Tartu, Estonia;Institute of Computer Science, University of Tartu, Tartu, Estonia

  • Venue:
  • Information and Software Technology
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Context: Source code revision control systems contain vast amounts of data that can be exploited for various purposes. For example, the data can be used as a base for estimating future code maintenance effort in order to plan software maintenance activities. Previous work has extensively studied the use of metrics extracted from object-oriented source code to estimate future coding effort. In comparison, the use of other types of metrics for this purpose has received significantly less attention. Objective: This paper applies machine learning techniques to unveil predictors of yearly cumulative code churn of software projects on the basis of metrics extracted from revision control systems. Method: The study is based on a collection of object-oriented code metrics, XML code metrics, and organisational metrics. Several models are constructed with different subsets of these metrics. The predictive power of these models is analysed based on a dataset extracted from eight open-source projects. Results: The study shows that a code churn estimation model built purely with organisational metrics is superior to one built purely with code metrics. However, a combined model provides the highest predictive power. Conclusion: The results suggest that code metrics in general, and XML metrics in particular, are complementary to organisational metrics for the purpose of estimating code churn.