Classification on defective items using unidentified samples

Authors:
Tze Fen Li;Shui-Ching Chang
Affiliations:
Department of Applied Mathematics, National Chung Hsing University, Kuo-Kuang Road, Taichung 40227, Taiwan, ROC;Department of Applied Mathematics, National Chung Hsing University, Kuo-Kuang Road, Taichung 40227, Taiwan, ROC
Venue:
Pattern Recognition
Year:
2005




Abstract

A set of unlabelled items is used to establish a decision rule that classifies items as good or defective. The lifetime of an item is assumed to follow an exponential distribution. The Bayes decision rule, which separates good items from defective ones, attains the minimum probability of misclassification, but it requires the prior probability (the defective percentage) and the two mean lifetimes. For a set of unidentified samples, these three parameters are unknown and must be estimated before the Bayes decision rule can be applied. In this study, a stochastic approximation procedure is established that estimates the three unknown parameters from the unlabelled samples. The Bayes decision rule with the estimated parameters substituted is an empirical Bayes (EB) decision rule. As the number of unlabelled items increases, the estimates computed by the procedure converge to the true parameters, so the EB decision rule gradually adapts into a better classifier until it becomes the Bayes decision rule. The results of a Monte Carlo simulation study are presented to demonstrate that the correct classification rates of the EB decision rule converge to the highest correct classification rates, which are those of the Bayes decision rule.
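
The Bayes decision rule described in the abstract can be illustrated with a minimal sketch, under the assumption that defective items have the shorter mean lifetime. All function names and parameter values here are hypothetical and chosen for illustration only; they are not taken from the paper. The authors' stochastic approximation procedure is not reproduced: the sketch uses known parameters, and an EB rule would be obtained by passing estimates of the three parameters to the same functions.

```python
import math
import random


def bayes_is_defective(x, p, mean_def, mean_good):
    """Classify lifetime x as defective when p*f_def(x) > (1-p)*f_good(x).

    p is the prior probability of a defective item; both lifetimes are
    exponential with the given means. The comparison is done in log form.
    """
    log_def = math.log(p) - math.log(mean_def) - x / mean_def
    log_good = math.log(1.0 - p) - math.log(mean_good) - x / mean_good
    return log_def > log_good


def threshold(p, mean_def, mean_good):
    """Lifetime below which the rule says 'defective'.

    Assumes mean_def < mean_good. A non-positive value means that no
    lifetime is ever classified as defective.
    """
    rate_gap = 1.0 / mean_def - 1.0 / mean_good
    return math.log(p * mean_good / ((1.0 - p) * mean_def)) / rate_gap


def bayes_accuracy(p, mean_def, mean_good):
    """Closed-form correct classification rate of the Bayes rule."""
    t = max(threshold(p, mean_def, mean_good), 0.0)
    caught_defective = p * (1.0 - math.exp(-t / mean_def))
    kept_good = (1.0 - p) * math.exp(-t / mean_good)
    return caught_defective + kept_good


def simulate_accuracy(p, mean_def, mean_good, n, seed=0):
    """Monte Carlo estimate of the correct classification rate."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n):
        defective = rng.random() < p
        mean = mean_def if defective else mean_good
        x = rng.expovariate(1.0 / mean)  # expovariate takes the rate, 1/mean
        if bayes_is_defective(x, p, mean_def, mean_good) == defective:
            correct += 1
    return correct / n


if __name__ == "__main__":
    # Illustrative values only: 30% defective, mean lifetimes 1 and 5.
    p, mean_def, mean_good = 0.3, 1.0, 5.0
    print("threshold:", threshold(p, mean_def, mean_good))
    print("closed-form rate:", bayes_accuracy(p, mean_def, mean_good))
    print("simulated rate:", simulate_accuracy(p, mean_def, mean_good, 20000))
```

Because both densities are exponential, the rule reduces to a single lifetime threshold, which is why the correct classification rate has a closed form that the simulated rate can be checked against.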