Finding Predictors of Field Defects for Open Source Software Systems in Commonly Available Data Sources: A Case Study of OpenBSD

Authors:
Paul Luo Li; Jim Herbsleb; Mary Shaw
Affiliations:
Carnegie Mellon University; Carnegie Mellon University; Carnegie Mellon University
Venue:
METRICS '05 Proceedings of the 11th IEEE International Software Metrics Symposium
Year:
2005

Cited by (5):

Experiences and results from initiating field defect prediction and product test prioritization efforts at ABB Inc.
Proceedings of the 28th international conference on Software engineering

Misclassification cost-sensitive fault prediction models
PROMISE '09 Proceedings of the 5th International Conference on Predictor Models in Software Engineering

On Reducing the Pre-release Failures of Web Plug-In on Social Networking Site
ICSP '09 Proceedings of the International Conference on Software Process: Trustworthy Software Development Processes

Approximating deployment metrics to predict field defects and plan corrective maintenance activities
ISSRE'09 Proceedings of the 20th IEEE international conference on software reliability engineering

Are popular classes more defect prone?
FASE'10 Proceedings of the 13th international conference on Fundamental Approaches to Software Engineering


Abstract

Open source software systems are important components of many business software applications. Field defect predictions for open source software systems may allow organizations to make informed decisions regarding open source software components. In this paper, we remotely measure and analyze predictors (metrics available before release) mined from established data sources (the code repository and the request tracking system) as well as a novel source of data (mailing list archives) for nine releases of OpenBSD. First, we attempt to predict field defects by extending a software reliability model fitted to development defects. We find this approach to be infeasible, which motivates examining metrics-based field defect prediction. Then, we evaluate 139 predictors using established statistical methods: Kendall's rank correlation, Spearman's rank correlation, and forward AIC model selection. The metrics we collect include product metrics, development metrics, deployment and usage metrics, and software and hardware configuration metrics. We find the number of messages to the technical discussion mailing list during the development period (a deployment and usage metric captured from mailing list archives) to be the best predictor of field defects. Our work identifies predictors of field defects in commonly available data sources for open source software systems and is a step towards metrics-based field defect prediction for quantitatively based decision making regarding open source software components.
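
As an illustration of the rank-correlation step mentioned in the abstract, the following is a minimal sketch of screening candidate predictors by Kendall's rank correlation against per-release field defect counts. It is not the authors' code. The function names (`kendall_tau_a`, `rank_predictors`), the metric names, and all numbers are hypothetical placeholders and are not data from the paper. The sketch uses the simple tau-a variant, which ignores ties; whether the paper used this variant is not stated in the abstract.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def rank_predictors(predictors, field_defects):
    """Sort predictors by the absolute value of their tau with field defects."""
    scores = {
        name: kendall_tau_a(values, field_defects)
        for name, values in predictors.items()
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    # Placeholder values for nine releases; NOT measurements from the paper.
    field_defects = [5, 7, 9, 8, 12, 13, 15, 14, 18]
    predictors = {
        "metric_a": [120, 150, 170, 160, 210, 230, 260, 240, 300],
        "metric_b": [50, 20, 70, 40, 30, 90, 10, 60, 80],
    }
    for name, tau in rank_predictors(predictors, field_defects):
        print(f"{name}: tau = {tau:.3f}")
```

With only nine releases, a correlation of this kind rests on 36 release pairs, so any ranking it produces should be read as a screening aid rather than a fitted prediction model.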