Background: Several studies have examined code churn as a variable for predicting faults in large software systems. High churn is usually associated with more faults: code that has been changed frequently tends to contain more of them.

Aims: We investigate the extent to which faults can be predicted by the degree of churn alone, whether other code characteristics occur together with churn, and which combinations of churn and other characteristics provide the best predictions. We also investigate different types of churn, including additions to and deletions from code, as well as the overall amount of change to code.

Method: We mined the version control database of a large software system to collect churn and other software measures from 18 successive releases of the system. We examine the frequency of faults plotted against various code characteristics, and evaluate a diverse set of prediction models based on many different combinations of independent variables, including both absolute and relative churn.

Results: Churn measures based on counts of lines added, deleted, and modified are very effective for fault prediction. Individually, counts of adds and modifications outperform counts of deletes, while the sum of all three counts is the most effective. However, these counts did not improve prediction accuracy relative to a model that included a simple count of the number of times that a file had been changed in the prior release.

Conclusions: Including a measure of change in the prior release is an essential component of our fault prediction method. The various measures of change we examined appear to work roughly equally well.
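To make the churn measures concrete, the following is a minimal sketch of how per-file counts for one release might be aggregated. It is an illustration under stated assumptions, not the authors' tooling: the `Change` record layout, the function `churn_by_file`, and the sample data are all hypothetical, and relative churn is assumed here to mean total churn divided by file size in lines, which the abstract does not define.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Change:
    """One committed change to a file (hypothetical record layout)."""
    file: str
    added: int
    deleted: int
    modified: int


def churn_by_file(changes, file_sizes):
    """Aggregate per-file churn measures for a single release.

    file_sizes maps file name -> lines of code; it is used only for the
    assumed definition of relative churn (total churn / file size).
    """
    totals = defaultdict(
        lambda: {"changes": 0, "added": 0, "deleted": 0, "modified": 0}
    )
    for c in changes:
        t = totals[c.file]
        t["changes"] += 1          # simple count of times the file changed
        t["added"] += c.added
        t["deleted"] += c.deleted
        t["modified"] += c.modified

    result = {}
    for name, t in totals.items():
        total = t["added"] + t["deleted"] + t["modified"]
        size = file_sizes.get(name, 0)
        result[name] = {
            **t,
            "total_churn": total,  # sum of all three line counts
            "relative_churn": total / size if size else None,
        }
    return result


# Made-up sample data, for illustration only.
changes = [
    Change("a.c", added=10, deleted=2, modified=3),
    Change("a.c", added=5, deleted=0, modified=5),
    Change("b.c", added=1, deleted=1, modified=0),
]
sizes = {"a.c": 100, "b.c": 50}
measures = churn_by_file(changes, sizes)
```

Here `changes` per file corresponds to the simple change count mentioned in the results, while `total_churn` corresponds to the sum of lines added, deleted, and modified. Either could serve as an independent variable in a prediction model.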