Understanding the impact of code and process metrics on post-release defects: a case study on the Eclipse project

Authors:
Emad Shihab;Zhen Ming Jiang;Walid M. Ibrahim;Bram Adams;Ahmed E. Hassan
Affiliations:
Queen's University, Kingston, ON, Canada;Queen's University, Kingston, ON, Canada;Queen's University, Kingston, ON, Canada;Queen's University, Kingston, ON, Canada;Queen's University, Kingston, ON, Canada
Venue:
Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement
Year:
2010

References (27):
RCS—a system for version control. Software—Practice & Experience.
Multivariate data analysis with readings (2nd ed.).
An Analysis of Several Software Defect Models. IEEE Transactions on Software Engineering.
A Validation of Object-Oriented Design Metrics as Quality Indicators. IEEE Transactions on Software Engineering.
Predicting Fault-Prone Software Modules in Telephone Switches. IEEE Transactions on Software Engineering.
A Unified Framework for Coupling Measurement in Object-Oriented Systems. IEEE Transactions on Software Engineering.
Exploring the relationship between design measures and software quality in object-oriented systems. Journal of Systems and Software.
Predicting Fault Incidence Using Software Change History. IEEE Transactions on Software Engineering.
The prediction of faulty classes using object-oriented design metrics. Journal of Systems and Software.
An empirical evaluation of fault-proneness models. Proceedings of the 24th International Conference on Software Engineering.
Leveraging Legacy System Dollars for E-Business. IT Professional.
A Metrics Suite for Object Oriented Design. IEEE Transactions on Software Engineering.
Empirical Analysis of CK Metrics for Object-Oriented Design Complexity: Implications for Software Defects. IEEE Transactions on Software Engineering.
A complexity measure. ICSE '76 Proceedings of the 2nd international conference on Software engineering.
Use of relative code churn measures to predict system defect density. Proceedings of the 27th international conference on Software engineering.
Static analysis tools as early indicators of pre-release defect density. Proceedings of the 27th international conference on Software engineering.
When do changes induce fixes? MSR '05 Proceedings of the 2005 international workshop on Mining software repositories.
Empirical Validation of Object-Oriented Metrics on Open Source Software for Fault Prediction. IEEE Transactions on Software Engineering.
Mining metrics to predict component failures. Proceedings of the 28th international conference on Software engineering.
Predicting fault-prone components in a java legacy system. Proceedings of the 2006 ACM/IEEE international symposium on Empirical software engineering.
Predicting Defects for Eclipse. PROMISE '07 Proceedings of the Third International Workshop on Predictor Models in Software Engineering.
A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction. Proceedings of the 30th international conference on Software engineering.
Predicting faults using the complexity of code changes. ICSE '09 Proceedings of the 31st International Conference on Software Engineering.
MapReduce as a general framework to support research in Mining Software Repositories (MSR). MSR '09 Proceedings of the 2009 6th IEEE International Working Conference on Mining Software Repositories.
Does calling structure information improve the accuracy of fault prediction? MSR '09 Proceedings of the 2009 6th IEEE International Working Conference on Mining Software Repositories.
On the Relationship Between Change Coupling and Software Defects. WCRE '09 Proceedings of the 2009 16th Working Conference on Reverse Engineering.
Software Dependencies, Work Dependencies, and Their Impact on Failures. IEEE Transactions on Software Engineering.

Cited by (7):
Pragmatic prioritization of software quality assurance efforts. Proceedings of the 33rd International Conference on Software Engineering.
Are change metrics good predictors for an evolving software product line? Proceedings of the 7th International Conference on Predictive Models in Software Engineering.
High-impact defects: a study of breakage and surprise defects. Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on Foundations of software engineering.
Method-level bug prediction. Proceedings of the ACM-IEEE international symposium on Empirical software engineering and measurement.
Recalling the "imprecision" of cross-project defect prediction. Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering.
Studying the impact of social interactions on software quality. Empirical Software Engineering.
Is lines of code a good measure of effort in effort-aware models? Information and Software Technology.

Abstract

Research on the quality of software applications continues to grow rapidly, with researchers building regression models that combine a large number of metrics. However, these models are hard to deploy in practice because of the cost of collecting all the needed metrics, the complexity of the models, and their black-box nature. For example, techniques such as PCA merge a large number of metrics into composite metrics that are no longer easy to explain. In this paper, we use a statistical approach recently proposed by Cataldo et al. to create explainable regression models. A case study on the Eclipse open source project shows that only 4 of the 34 code and process metrics impact the likelihood of finding a post-release defect. In addition, our approach is able to quantify the impact of these metrics on the likelihood of finding post-release defects. Finally, we demonstrate that our simple models achieve performance comparable to that of more complex PCA-based models while providing practitioners with intuitive explanations for their predictions.
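
To make the idea of quantifying a metric's impact on defect likelihood concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the procedure used in the paper: the abstract does not specify the model form, so logistic regression with odds ratios is assumed here, the data are synthetic, and the metric names `prior_changes` and `noise_metric` are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(rows, labels, lr=0.5, epochs=500):
    """Fit an intercept plus one coefficient per metric by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)  # weights[0] is the intercept
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            err = p - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def make_synthetic_files(n=400, seed=0):
    """Generate synthetic per-file data (assumption: not Eclipse data)."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        prior_changes = rng.gauss(0.0, 1.0)  # hypothetical standardized process metric
        noise_metric = rng.gauss(0.0, 1.0)   # hypothetical metric unrelated to defects
        p_defect = sigmoid(-1.0 + 1.2 * prior_changes)
        labels.append(1 if rng.random() < p_defect else 0)
        rows.append([prior_changes, noise_metric])
    return rows, labels


rows, labels = make_synthetic_files()
weights = fit_logistic(rows, labels)

# exp(coefficient) is the multiplicative change in the odds of a defect
# for a one-unit (here: one standard deviation) increase in the metric.
names = ["prior_changes", "noise_metric"]
odds_ratios = {name: math.exp(w) for name, w in zip(names, weights[1:])}
for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio per +1 standard deviation = {ratio:.2f}")
```

An odds ratio well above 1 means that raising the metric increases the odds of a defect, while a ratio near 1 suggests little impact. Reading each metric's effect directly from its own coefficient is the kind of explanation that is lost once metrics are merged into PCA components.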