DConfusion: a technique to allow cross study performance evaluation of fault prediction studies
Automated Software Engineering
There are many hundreds of fault prediction models published in the literature, and their predictive performance is reported using a variety of different measures. Most of these measures are not directly comparable, so it is often difficult to evaluate the performance of one model against another. Our aim is to present an approach that allows researchers and practitioners to transform many of the performance measures reported in categorical studies back into a confusion matrix. Once performance is expressed as a confusion matrix, alternative preferred performance measures can be derived. Our approach has enabled us to compare the performance of 600 models published in 42 studies. We demonstrate the application of our approach on several case studies and discuss the advantages and implications of doing so.
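The core idea, as the abstract describes it, is to recover the underlying confusion matrix from whichever measures a categorical study reports and then derive preferred measures from it. The sketch below is a minimal illustration of that idea under stated assumptions, not the authors' DConfusion technique itself: it assumes a study reports the number of modules, the defect rate, recall, and precision, solves for TP, FN, FP, and TN, and then derives F1 and MCC. The function names are hypothetical.

```python
import math

def confusion_from_measures(n, defect_rate, recall, precision):
    """Reconstruct a confusion matrix from commonly reported measures.

    Hypothetical illustration, not the authors' DConfusion tool:
    given the number of modules n, the proportion of defective modules,
    recall (TP / (TP + FN)) and precision (TP / (TP + FP)),
    solve for the four confusion-matrix cells.
    """
    positives = n * defect_rate      # actual defective modules
    tp = recall * positives          # from recall = TP / (TP + FN)
    fn = positives - tp
    fp = tp / precision - tp         # from precision = TP / (TP + FP)
    tn = n - tp - fn - fp
    return tp, fp, fn, tn

def derived_measures(tp, fp, fn, tn):
    """Derive alternative measures once the confusion matrix is known."""
    f1 = 2 * tp / (2 * tp + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"F1": round(f1, 3), "MCC": round(mcc, 3)}

# Example: a study reporting recall 0.70 and precision 0.50
# on 1000 modules with a 20% defect rate.
tp, fp, fn, tn = confusion_from_measures(1000, 0.20, 0.70, 0.50)
print(derived_measures(tp, fp, fn, tn))
```

Other reported combinations (for example pd/pf together with the class balance) lead to analogous sets of equations; the same reconstruction strategy applies whenever enough independent measures are reported to pin down all four cells.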