Nearest neighbor sampling for cross company defect predictors: abstract only

  • Authors:
  • Burak Turhan;Ayşe Bener;Tim Menzies

  • Affiliations:
  • Boǧaziçi University, Bebek, Istanbul, Turkey;Boǧaziçi University, Bebek, Istanbul, Turkey;West Virginia University, Morgantown, WV

  • Venue:
  • DEFECTS '08 Proceedings of the 2008 workshop on Defects in large software systems
  • Year:
  • 2008

Abstract

Many studies in defect prediction focus on building models from available local data (i.e., within-company predictors). To employ these models, a company needs a data repository in which project metrics and defect information from past projects are stored. However, few companies follow this practice. In recent work, we showed that cross-company data can be used to build predictors, but at the cost of increased false alarms. We therefore argued that the practical application of cross-company predictors is limited to mission-critical projects and that companies should strive for local data. In this paper, we show that nearest neighbor (NN) sampling of cross-company data removes the increase in false alarm rates. We conclude that cross-company defect predictors can be practical tools when combined with NN sampling, yet local predictors are still the best, and companies should keep striving for local data.
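
The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one plausible reading: for each local instance, keep the k cross-company instances closest to it, and use the union of those as training data. The function name `nn_sample`, the default `k=10`, the use of Euclidean distance, and the toy metric vectors are all assumptions made for illustration, not details taken from the paper.

```python
import math


def nn_sample(cross_rows, local_rows, k=10):
    """Return sorted indices of cross-company rows that are among the
    k nearest (Euclidean) neighbours of at least one local row.

    Assumption: rows are equal-length tuples of numeric metrics.
    """
    selected = set()
    for query in local_rows:
        # Rank every cross-company row by its distance to this local row.
        ranked = sorted(
            range(len(cross_rows)),
            key=lambda i: math.dist(cross_rows[i], query),
        )
        # Keep the k closest; the set removes rows picked more than once.
        selected.update(ranked[:k])
    return sorted(selected)


# Hypothetical metric vectors, e.g. (lines of code, cyclomatic complexity).
cross = [(10.0, 1.0), (200.0, 15.0), (12.0, 2.0), (500.0, 40.0), (11.0, 1.5)]
local = [(11.0, 1.0), (13.0, 2.0)]

picked = nn_sample(cross, local, k=2)
print(picked)  # indices of the cross-company rows kept for training
```

Under this reading, a predictor would then be trained only on the selected cross-company rows, so that dissimilar projects no longer contribute to the model. In practice the metrics would likely need to be scaled first, since raw Euclidean distance is dominated by the largest-valued feature.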