Background: The accurate prediction of where faults are likely to occur in code can help direct test effort, reduce costs, and improve the quality of software.

Objective: We investigate how the context of models, the independent variables used, and the modeling techniques applied influence the performance of fault prediction models.

Method: We used a systematic literature review to identify 208 fault prediction studies published from January 2000 to December 2010. We synthesize the quantitative and qualitative results of the 36 studies that report sufficient contextual and methodological information according to the criteria we develop and apply.

Results: The models that perform well tend to be based on simple modeling techniques such as Naive Bayes or Logistic Regression. Models that perform well have used combinations of independent variables, and the models that perform particularly well have applied feature selection to these combinations.

Conclusion: The methodology used to build models seems to influence predictive performance. Although there is a set of fault prediction studies in which confidence is possible, more studies are needed that use a reliable methodology and that report their context, methodology, and performance comprehensively.
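To make the Results above more concrete, the following is a minimal sketch of the kind of simple model they describe: a Gaussian Naive Bayes classifier trained on a combination of code metrics after a basic filter-style feature selection step. It is not taken from any of the reviewed studies. The metric names, the toy data, the separation score, and every function name here are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, pvariance

# Hypothetical toy data (an assumption, not from the review): each row holds
# (lines_of_code, cyclomatic_complexity, recent_changes) for one module;
# labels mark whether the module was later found faulty (1) or not (0).
X = [
    (120, 4, 1), (300, 12, 5), (80, 2, 0), (450, 20, 7),
    (200, 6, 2), (520, 25, 9), (90, 3, 1), (380, 15, 6),
]
y = [0, 1, 0, 1, 0, 1, 0, 1]


def select_features(X, y, k):
    """Rank features by a simple class-separation score and keep the top k."""
    scores = []
    for f in range(len(X[0])):
        faulty = [x[f] for x, t in zip(X, y) if t == 1]
        clean = [x[f] for x, t in zip(X, y) if t == 0]
        # Scale by the overall standard deviation so metrics are comparable.
        spread = math.sqrt(pvariance([x[f] for x in X])) or 1.0
        scores.append((abs(mean(faulty) - mean(clean)) / spread, f))
    return [f for _, f in sorted(scores, reverse=True)[:k]]


def fit_gaussian_nb(X, y, features):
    """Estimate a class prior and per-feature mean/variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for f in features:
            values = [r[f] for r in rows]
            # A small floor on the variance avoids division by zero.
            stats.append((mean(values), max(pvariance(values), 1e-9)))
        model[label] = (len(rows) / len(y), stats)
    return model


def predict(model, x, features):
    """Return the class with the highest log posterior under the model."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for f, (mu, var) in zip(features, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (x[f] - mu) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Usage: keep the two most discriminating metrics, then classify new modules.
selected = select_features(X, y, 2)
model = fit_gaussian_nb(X, y, selected)
large_module = predict(model, (500, 22, 8), selected)
small_module = predict(model, (100, 3, 1), selected)
```

A real study would evaluate such a model on held-out data and report the context and performance measures that the Conclusion calls for. This sketch omits that step to stay short.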