Defect prediction models aim to identify error-prone parts of a software system as early as possible. Many such models have been proposed; how to evaluate them, however, is still an open question, as recent publications show. An important aspect often ignored during evaluation is the effort reduction gained by using such models. Models are usually evaluated per module with performance measures borrowed from information retrieval, such as recall, precision, or the area under the ROC curve (AUC). These measures assume that the costs of additional quality assurance activities are the same for every module, which is not reasonable in practice: the cost of unit testing or code review, for example, is roughly proportional to the size of a module. In this paper, we investigate this discrepancy using optimal and trivial models. We describe a trivial model that takes only module size, measured in lines of code, into account and compare it to five classification methods. The trivial model performs surprisingly well when evaluated with AUC. However, once an effort-sensitive performance measure is used, it becomes apparent that the trivial model is in fact the worst.
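
As a rough illustration of the contrast described above, the sketch below scores each module by its size in lines of code (the "trivial model"), evaluates that ranking with AUC, and then takes an effort-aware view by tracking how many defective modules are found per line of code that must be inspected. The module data and the specific effort-aware measure are assumptions made for illustration only; they are not the paper's dataset or metric.

    # Illustrative sketch, not the paper's exact setup: score modules by LOC only.
    from sklearn.metrics import roc_auc_score

    # Hypothetical modules: (lines of code, 1 if the module is defective else 0)
    modules = [(1200, 1), (150, 0), (870, 1), (45, 0), (300, 0), (980, 1), (60, 1)]

    loc = [m[0] for m in modules]
    defective = [m[1] for m in modules]

    # Trivial model: the predicted "risk score" is simply the module size.
    auc = roc_auc_score(defective, loc)
    print(f"AUC of the trivial LOC model: {auc:.2f}")

    # Effort-aware view: inspect modules in descending score order and track how
    # many defective modules are found per line of code that has to be reviewed.
    ranked = sorted(modules, key=lambda m: m[0], reverse=True)
    effort, found = 0, 0
    for size, is_defective in ranked:
        effort += size          # inspection cost grows with module size
        found += is_defective
        print(f"inspected {effort:5d} LOC -> {found} defective modules found")

Under a per-module view the large modules look like excellent predictions, but the effort-aware trace makes visible that each of those "hits" costs many more lines of inspection than a smaller module would.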