A Systematic Literature Review on Fault Prediction Performance in Software Engineering

Authors:
Tracy Hall;Sarah Beecham;David Bowes;David Gray;Steve Counsell
Affiliations:
Brunel University, Uxbridge;University of Limerick, Limerick;University of Hertfordshire, Hatfield;University of Hertfordshire, Hatfield;Brunel University, Uxbridge
Venue:
IEEE Transactions on Software Engineering
Year:
2012

Citing 0
Cited 14

Introduction to the Special Issue on Mining Software Repositories in 2010
Empirical Software Engineering

SLuRp: a tool to help large complex systematic literature reviews deliver valid and rigorous results
Proceedings of the 2nd international workshop on Evidential assessment of software technologies

Questioning software maintenance metrics: a comparative case study
Proceedings of the ACM-IEEE international symposium on Empirical software engineering and measurement

A benchmarking-inspired approach to determine threshold values for metrics
ACM SIGSOFT Software Engineering Notes

An industrial study on the risk of software changes
Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering

Empirical evaluation of the effects of mixed project data on learning defect predictors
Information and Software Technology

Beyond data mining; towards "idea engineering"
Proceedings of the 9th International Conference on Predictive Models in Software Engineering

A study of subgroup discovery approaches for defect prediction
Information and Software Technology

A study of cyclic dependencies on defect profile of software components
Journal of Systems and Software

An in-depth study of the potentially confounding effect of class size in fault prediction
ACM Transactions on Software Engineering and Methodology (TOSEM)

System performance analyses through object-oriented fault and coupling prisms
Proceedings of the 5th ACM/SPEC international conference on Performance engineering

Technical debt at the crossroads of research and practice: report on the fifth international workshop on managing technical debt
ACM SIGSOFT Software Engineering Notes

Software defect prediction using relational association rule mining
Information Sciences: an International Journal

DConfusion: a technique to allow cross study performance evaluation of fault prediction studies
Automated Software Engineering

Abstract

Background: The accurate prediction of where faults are likely to occur in code can help direct test effort, reduce costs, and improve the quality of software.
Objective: We investigate how the context of models, the independent variables used, and the modeling techniques applied influence the performance of fault prediction models.
Method: We used a systematic literature review to identify 208 fault prediction studies published from January 2000 to December 2010. We synthesize the quantitative and qualitative results of the 36 studies that report sufficient contextual and methodological information according to the criteria we develop and apply.
Results: The models that perform well tend to be based on simple modeling techniques such as Naive Bayes or Logistic Regression. Models that perform well have used combinations of independent variables, and the models that perform particularly well have applied feature selection to these combinations.
Conclusion: The methodology used to build models seems to influence predictive performance. Although there is a set of fault prediction studies in which confidence is possible, more studies are needed that use a reliable methodology and report their context, methodology, and performance comprehensively.