Software fault prediction models play an important role in software quality assurance. They identify software subsystems (modules, components, classes, or files) that are likely to contain faults; these subsystems, in turn, receive additional resources for verification and validation activities. Fault prediction models are binary classifiers, typically developed with a supervised learning technique from either a subset of the fault data of the current project or data from a similar past project. In practice, it is critical that such models provide reliable prediction performance on data not used in training. Variance is an important indicator of the reliability of software fault prediction models, yet it is often ignored or barely mentioned in published studies. In this paper, by analyzing twelve data sets from a public software engineering repository from the perspective of variance, we explore five questions regarding fault prediction models: (1) Do different types of classification performance measures exhibit different variance? (2) Does the size of the data set imply more (or less) accurate prediction performance? (3) Does the size of the training subset affect a model's stability? (4) Do different classifiers consistently differ in the variance of their models? (5) Are there differences between the variance observed over 1,000 runs and over 10 runs of 10-fold cross-validation experiments? Our results indicate that variance is a very important factor in understanding fault prediction models, and we recommend best practices for reporting variance in empirical software engineering studies.
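The following is a minimal sketch, not the authors' experimental setup, of how variance across repeated 10-fold cross-validation runs can be estimated for a fault prediction model. The synthetic data set, the random forest classifier, the AUC metric, and the repeat counts (10 vs. 100 here, rather than 1,000) are all illustrative assumptions.

```python
# Sketch: variance of a fault prediction model's performance over repeated
# 10-fold cross-validation. Data, classifier, and metric are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Stand-in for a repository data set: static code attributes with an
# imbalanced binary fault label (most modules are fault-free).
X, y = make_classification(n_samples=500, n_features=20,
                           weights=[0.85, 0.15], random_state=0)

def repeated_cv_scores(n_repeats, n_splits=10):
    """Return one mean AUC per repeat of n_splits-fold cross-validation."""
    scores = []
    for seed in range(n_repeats):
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
        clf = RandomForestClassifier(n_estimators=100, random_state=seed)
        fold_auc = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
        scores.append(fold_auc.mean())
    return np.array(scores)

# Contrast the spread of results from few vs. many repeated CV runs.
for n_repeats in (10, 100):
    s = repeated_cv_scores(n_repeats)
    print(f"{n_repeats:4d} repeats: mean AUC = {s.mean():.3f}, "
          f"variance = {s.var(ddof=1):.5f}")
```

Reporting the variance (or a confidence interval) alongside the mean of such repeated runs, rather than a single cross-validation estimate, is the kind of practice the paper argues for.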