Statistical models vs. expert estimation for fault prediction in modified code - an industrial case study

Authors:
Piotr Tomaszewski;Jim Håkansson;Håkan Grahn;Lars Lundberg
Affiliations:
Department of Systems and Software Engineering, School of Engineering, Blekinge Institute of Technology, SE-372 25 Ronneby, Sweden;Department of Systems and Software Engineering, School of Engineering, Blekinge Institute of Technology, SE-372 25 Ronneby, Sweden;Department of Systems and Software Engineering, School of Engineering, Blekinge Institute of Technology, SE-372 25 Ronneby, Sweden;Department of Systems and Software Engineering, School of Engineering, Blekinge Institute of Technology, SE-372 25 Ronneby, Sweden
Venue:
Journal of Systems and Software
Year:
2007

Abstract

Statistical fault prediction models and expert estimations are two popular methods for deciding where to focus fault detection efforts when the fault detection budget is limited. In this paper, we present a study in which we empirically compare the accuracy of fault predictions made by statistical models with the accuracy of expert estimations. The study is performed in an industrial setting. We invited eleven experts who are involved in the development of two large telecommunication systems. Our statistical prediction models are built on historical data describing one release of one of those systems. We compare the performance of these models with the performance of our experts when predicting faults in the latest releases of both systems. We show that the statistical methods clearly outperform the expert estimations. We attribute the superiority of the statistical models mainly to their ability to cope with large datasets, which makes it possible for them to make reliable predictions for all components in the system. It also enables prediction at a finer-grained level, e.g., at the class level instead of the component level. We show that such fine-grained prediction is better from both the theoretical and the practical perspective.