Empirical Evaluation of Mixed-Project Defect Prediction Models

Authors:
Burak Turhan;Ayse Tosun;Ayse Bener
Venue:
SEAA '11 Proceedings of the 2011 37th EUROMICRO Conference on Software Engineering and Advanced Applications
Year:
2011

Citing 0
Cited 3

Recalling the "imprecision" of cross-project defect prediction
Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering

Empirical evaluation of the effects of mixed project data on learning defect predictors
Information and Software Technology

How, and why, process metrics are better
Proceedings of the 2013 International Conference on Software Engineering

Abstract

Defect prediction research mostly focuses on optimizing the performance of models that are constructed for isolated projects. On the other hand, recent studies try to utilize data across projects for building defect prediction models. We combine both approaches and investigate the effects of using mixed (i.e., within- and cross-) project data on defect prediction performance, which has not been addressed in previous studies. We conduct experiments to analyze models learned from mixed-project data using ten proprietary projects from two different organizations. We observe that code-metric-based mixed-project models yield only minor improvements in prediction performance for a limited number of cases that are difficult to characterize. Based on existing studies and our results, we conclude that using cross-project data for defect prediction is still an open challenge that should only be considered in environments where there is no local data collection activity, and that using data from other projects in addition to a project's own data does not pay off in terms of performance.