Benchmarking Classification Models for Software Defect Prediction: A Proposed Framework and Novel Findings

  • Authors: Stefan Lessmann; Bart Baesens; Christophe Mues; Swantje Pietsch
  • Affiliations: University of Hamburg, Hamburg; K.U.Leuven, Leuven; University of Southampton, Southampton; University of Hamburg, Hamburg
  • Venue: IEEE Transactions on Software Engineering
  • Year: 2008

Abstract

Software defect prediction strives to improve software quality and testing efficiency by constructing predictive classification models from code attributes to enable a timely identification of fault-prone modules. Several classification models have been evaluated for this task. However, due to inconsistent findings regarding the superiority of one classifier over another and the usefulness of metric-based classification in general, more research is needed to improve convergence across studies and further advance confidence in experimental results. We consider three potential sources of bias: comparing classifiers over one or a small number of proprietary datasets, relying on accuracy indicators that are conceptually inappropriate for software defect prediction and cross-study comparisons, and, finally, limited use of statistical testing procedures to secure empirical findings. To remedy these problems, a framework for comparative software defect prediction experiments is proposed and applied in a large-scale empirical comparison of 22 classifiers over ten public-domain datasets from the NASA Metrics Data repository. Our results indicate that the importance of the particular classification algorithm may have been overestimated in previous research, since no significant performance differences could be detected among the top 17 classifiers.
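To make the evaluation protocol concrete, the sketch below mirrors the kind of framework the abstract describes: compare classifiers by a threshold-independent performance indicator (AUC) across several datasets, then apply a Friedman test to check whether observed rank differences are statistically significant. This is a minimal illustration, not the paper's implementation: the synthetic, imbalanced datasets stand in for the NASA MDP data, and the three classifiers are an assumed subset of the 22 the study benchmarks.

```python
# Minimal sketch of an AUC-based, statistically tested classifier benchmark.
# Assumptions: synthetic datasets replace the NASA MDP data, and only three
# illustrative classifiers are compared.

from scipy.stats import friedmanchisquare
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

classifiers = {
    "LogReg": LogisticRegression(max_iter=1000),
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "NaiveBayes": GaussianNB(),
}

# Stand-ins for defect datasets: imbalanced binary problems, since
# fault-prone modules are typically the minority class.
datasets = [
    make_classification(n_samples=500, n_features=20, weights=[0.85],
                        random_state=seed)
    for seed in range(5)
]

# Mean AUC per classifier per dataset, estimated via 10-fold cross-validation.
auc = {name: [] for name in classifiers}
for X, y in datasets:
    for name, clf in classifiers.items():
        scores = cross_val_score(clf, X, y, scoring="roc_auc", cv=10)
        auc[name].append(scores.mean())

# Friedman test over the per-dataset AUC values: a significant result would
# justify post-hoc pairwise comparisons (e.g., a Nemenyi test).
stat, p = friedmanchisquare(*auc.values())
print(f"Friedman chi-square = {stat:.3f}, p = {p:.4f}")
```

A non-significant Friedman result here corresponds to the abstract's central finding: once performance is measured by AUC across many datasets and subjected to statistical testing, the top-ranked classifiers may not differ significantly from one another.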