Nearest neighbor sampling for cross company defect predictors: abstract only

Authors:
Burak Turhan;Ayşe Bener;Tim Menzies
Affiliations:
Boğaziçi University, Bebek, Istanbul, Turkey; Boğaziçi University, Bebek, Istanbul, Turkey; West Virginia University, Morgantown, WV
Venue:
DEFECTS '08 Proceedings of the 2008 workshop on Defects in large software systems
Year:
2008

Abstract

Many studies in defect prediction focus on building models from available local data (i.e., within-company predictors). To employ these models, a company needs a data repository that stores project metrics and defect information from past projects. However, few companies maintain one. In recent work, we showed that cross-company data can be used to build predictors, at the cost of increased false alarms. We therefore argued that the practical application of cross-company predictors is limited to mission-critical projects and that companies should strive for local data. In this paper, we show that nearest neighbor (NN) sampling of cross-company data removes the increase in false alarm rates. We conclude that cross-company defect predictors can be practical tools when combined with NN sampling, yet local predictors are still the best, and companies should keep striving for local data.
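The abstract does not spell out how NN sampling is carried out, so the following is only a minimal sketch of one plausible reading: for each local instance, keep the k cross-company instances closest to it, and train on the union of the kept instances. The function name `nn_sample`, the use of Euclidean distance, the default `k=10`, and the toy metric vectors are all assumptions made for illustration, not details taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length metric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nn_sample(cross_rows, local_rows, k=10):
    """Return sorted indices of cross-company rows that are among the
    k nearest neighbors of at least one local row.

    Hypothetical helper: the distance measure and k are assumptions.
    """
    selected = set()
    for query in local_rows:
        # Rank every cross-company row by its distance to this local row.
        ranked = sorted(
            range(len(cross_rows)),
            key=lambda i: euclidean(cross_rows[i], query),
        )
        # Keep the k closest rows; the set removes duplicates across queries.
        selected.update(ranked[:k])
    return sorted(selected)


# Toy data (made up): each tuple is a vector of static code metrics.
cross = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (9.0, 9.0), (9.1, 8.8)]
local = [(1.1, 1.0), (9.0, 9.1)]

picked = nn_sample(cross, local, k=2)
print(picked)
```

In this sketch, the cross-company row at index 2 is never selected because it is not close to any local instance. Under the stated assumptions, the filtered subset rather than the full cross-company data would then be used to train the defect predictor.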