Empirical Evaluation of Mixed-Project Defect Prediction Models

  • Authors:
  • Burak Turhan; Ayse Tosun; Ayse Bener

  • Venue:
  • SEAA '11 Proceedings of the 2011 37th EUROMICRO Conference on Software Engineering and Advanced Applications
  • Year:
  • 2011

Abstract

Defect prediction research mostly focuses on optimizing the performance of models constructed for isolated projects. Recent studies, on the other hand, try to use data across projects to build defect prediction models. We combine both approaches and investigate the effects of using mixed (i.e., within- and cross-project) data on defect prediction performance, which has not been addressed in previous studies. We conduct experiments to analyze models learned from mixed-project data using ten proprietary projects from two different organizations. We observe that mixed-project models based on code metrics yield only minor improvements in prediction performance, and only in a limited number of cases that are difficult to characterize. Based on existing studies and our results, we conclude that using cross-project data for defect prediction is still an open challenge that should be considered only in environments where there is no local data collection activity, and that using data from other projects in addition to a project's own data does not pay off in terms of performance.
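
To make the three data settings concrete, the following is a minimal sketch of how within-project, cross-project, and mixed-project training sets could be assembled for one target project. It is an illustration only, not the authors' experimental procedure: the function name `build_training_sets`, the `n_local` parameter, the row layout, and the toy projects `A`, `B`, and `C` are all hypothetical.

```python
import random


def build_training_sets(projects, target, n_local, seed=0):
    """Assemble within-, cross-, and mixed-project training sets.

    projects: dict mapping a project name to a list of (features, label) rows.
    target:   name of the project whose defects we want to predict.
    n_local:  number of the target's own rows assumed available for training.
    All names and the sampling scheme here are illustrative assumptions.
    """
    rng = random.Random(seed)
    local_rows = list(projects[target])
    rng.shuffle(local_rows)

    within = local_rows[:n_local]    # the target project's own data only
    held_out = local_rows[n_local:]  # remaining target rows, kept for testing
    cross = [
        row
        for name, rows in projects.items()
        if name != target
        for row in rows
    ]                                # data from all other projects only
    mixed = within + cross           # own data plus other projects' data
    return within, cross, mixed, held_out


# Toy data: features are (lines of code, cyclomatic complexity); label 1 = defective.
projects = {
    "A": [((120, 4), 0), ((560, 19), 1), ((80, 2), 0), ((300, 11), 1)],
    "B": [((210, 7), 0), ((640, 23), 1), ((95, 3), 0)],
    "C": [((450, 15), 1), ((60, 1), 0)],
}

within, cross, mixed, held_out = build_training_sets(projects, "A", n_local=2)
print(len(within), len(cross), len(mixed), len(held_out))
```

Under this setup, the same classifier would be trained once on each of `within`, `cross`, and `mixed` and scored on `held_out`, so that any difference in performance can be attributed to the training data rather than to the learner.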