A set of unlabelled items is used to establish a decision rule for classifying items as good or defective. The lifetime of an item has an exponential distribution. The Bayes decision rule for classifying good and defective items is known to give the minimum probability of misclassification, but it requires the prior probability (the defective percentage) and the mean lifetimes of the two classes. For a set of unlabelled samples, these three parameters are unknown and must be estimated before the Bayes decision rule can be applied. In this study, a stochastic approximation procedure is established to estimate the three unknown parameters from the unlabelled samples. The Bayes decision rule with the estimated parameters substituted for the true ones is an empirical Bayes (EB) decision rule. As the number of unlabelled items increases, the estimates computed by the procedure converge to the true parameters, so the EB decision rule gradually adapts into a better classifier until it becomes the Bayes decision rule. The results of a Monte Carlo simulation study are presented to demonstrate that the correct classification rates of the EB decision rule converge to the highest correct classification rates, namely those given by the Bayes decision rule.
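As an illustration of the Bayes decision rule described in the abstract, the following minimal Python sketch assumes the three parameters are already known and that defective items have the shorter mean lifetime. The function names (`bayes_threshold`, `classify`, `bayes_correct_rate`, `simulate_correct_rate`) and the parameter values in the usage note are hypothetical, chosen only for illustration. The sketch does not implement the stochastic approximation estimator from the study. Under these assumptions, comparing the two weighted exponential densities reduces to a single lifetime threshold.

```python
import math
import random


def bayes_threshold(p_def, mean_def, mean_good):
    """Lifetime below which an item is classified as defective.

    Assumes 0 < p_def < 1 and 0 < mean_def < mean_good. The threshold solves
    p_def * f_def(x) = (1 - p_def) * f_good(x) for exponential densities.
    """
    if not 0.0 < p_def < 1.0:
        raise ValueError("p_def must lie strictly between 0 and 1")
    if not 0.0 < mean_def < mean_good:
        raise ValueError("expected 0 < mean_def < mean_good")
    rate_gap = 1.0 / mean_def - 1.0 / mean_good
    log_ratio = math.log(p_def * mean_good / ((1.0 - p_def) * mean_def))
    # A negative solution means no lifetime is ever classified as defective.
    return max(0.0, log_ratio / rate_gap)


def classify(lifetime, p_def, mean_def, mean_good):
    """Apply the Bayes rule to one observed lifetime."""
    threshold = bayes_threshold(p_def, mean_def, mean_good)
    return "defective" if lifetime < threshold else "good"


def bayes_correct_rate(p_def, mean_def, mean_good):
    """Exact probability of correct classification under the Bayes rule."""
    t = bayes_threshold(p_def, mean_def, mean_good)
    hit_def = 1.0 - math.exp(-t / mean_def)   # defective item falls below t
    hit_good = math.exp(-t / mean_good)       # good item survives past t
    return p_def * hit_def + (1.0 - p_def) * hit_good


def simulate_correct_rate(p_def, mean_def, mean_good, n_items, seed=0):
    """Monte Carlo estimate of the correct classification rate."""
    rng = random.Random(seed)
    t = bayes_threshold(p_def, mean_def, mean_good)
    correct = 0
    for _ in range(n_items):
        is_defective = rng.random() < p_def
        mean = mean_def if is_defective else mean_good
        lifetime = rng.expovariate(1.0 / mean)  # expovariate takes a rate
        if (lifetime < t) == is_defective:
            correct += 1
    return correct / n_items
```

For example, `bayes_correct_rate(0.5, 1.0, 2.0)` gives the exact rate for an illustrative parameter set, and `simulate_correct_rate(0.5, 1.0, 2.0, 20000)` approximates it by simulation. Passing estimated values instead of the true parameters to `classify` turns the same function into a plug-in rule of the EB kind.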