Does measuring code change improve fault prediction?

Authors:
Robert M. Bell;Thomas J. Ostrand;Elaine J. Weyuker
Affiliations:
AT&T Labs - Research, Florham Park, NJ;AT&T Labs - Research, Florham Park, NJ;AT&T Labs - Research, Florham Park, NJ
Venue:
Proceedings of the 7th International Conference on Predictive Models in Software Engineering
Year:
2011

Abstract

Background: Several studies have examined code churn as a variable for predicting faults in large software systems. High churn is usually associated with more faults appearing in code that has been changed frequently.

Aims: We investigate the extent to which faults can be predicted by the degree of churn alone, whether other code characteristics occur together with churn, and which combinations of churn and other characteristics provide the best predictions. We also investigate different types of churn, including both additions to and deletions from code, as well as overall amount of change to code.

Method: We have mined the version control database of a large software system to collect churn and other software measures from 18 successive releases of the system. We examine the frequency of faults plotted against various code characteristics, and evaluate a diverse set of prediction models based on many different combinations of independent variables, including both absolute and relative churn.

Results: Churn measures based on counts of lines added, deleted, and modified are very effective for fault prediction. Individually, counts of adds and modifications outperform counts of deletes, while the sum of all three counts was most effective. However, these counts did not improve prediction accuracy relative to a model that included a simple count of the number of times that a file had been changed in the prior release.

Conclusions: Including a measure of change in the prior release is an essential component of our fault prediction method. Various measures seem to work roughly equivalently.
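
As a rough illustration of the churn measures named in the abstract, the sketch below computes an absolute churn count (the sum of lines added, deleted, and modified) and a relative churn value for one file. The abstract does not define relative churn; dividing by file size is an assumption made here for illustration only. The `FileChange` record, its field names, and both function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class FileChange:
    """Per-file change counts for one release (hypothetical record layout)."""
    lines_added: int
    lines_deleted: int
    lines_modified: int
    file_size_loc: int  # file size in lines of code; assumed to be available


def absolute_churn(change: FileChange) -> int:
    """Sum of lines added, deleted, and modified."""
    return change.lines_added + change.lines_deleted + change.lines_modified


def relative_churn(change: FileChange) -> float:
    """Absolute churn divided by file size (an assumed normalization)."""
    if change.file_size_loc <= 0:
        # Avoid division by zero for empty or missing size data.
        return 0.0
    return absolute_churn(change) / change.file_size_loc


example = FileChange(lines_added=30, lines_deleted=10,
                     lines_modified=20, file_size_loc=400)
print(absolute_churn(example))  # 60
print(relative_churn(example))  # 0.15
```

Values like these would serve as independent variables in a prediction model, alongside simpler measures such as the number of times a file was changed in the prior release.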