Probabilistic combination of text classifiers using reliability indicators: models and results
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
In this paper, we study several k-nearest-neighbor (KNN) algorithms and present a new KNN algorithm based on evidence theory. We introduce two frequency-based estimates of prior probability: the global estimate (GE) and the local estimate (LE). The GE of a class is its prior probability over the whole training data space, estimated by frequency; the LE of a class in a particular neighborhood is its prior probability within that neighborhood, likewise estimated by frequency. By considering the difference between the GE and the LE of each class, we address the imbalanced data problem to some degree without resampling. We compare our algorithm with other KNN algorithms on two benchmark datasets. The results show that our KNN algorithm outperforms the other KNN algorithms, including the basic evidence-based KNN.
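The two estimates described above can be illustrated with a minimal Python sketch. It only computes the GE and the LE by frequency counting and reports their per-class difference; it does not reproduce the paper's evidence-theoretic combination, which the abstract does not specify. The function names, the Euclidean distance, the toy data, and the choice of k = 3 are all assumptions made for illustration.

```python
from collections import Counter
import math


def global_estimate(labels):
    """GE (assumed form): class frequency over the whole training set."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: counts[cls] / total for cls in counts}


def local_estimate(points, labels, query, k):
    """LE (assumed form): class frequency among the k nearest neighbors of query."""
    # Rank training points by Euclidean distance to the query (an assumption;
    # the abstract does not name a distance measure).
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    neighbors = [labels[i] for i in order[:k]]
    counts = Counter(neighbors)
    # Classes absent from the neighborhood get a local estimate of 0.
    return {cls: counts.get(cls, 0) / len(neighbors) for cls in set(labels)}


# Hypothetical imbalanced toy data: six points of class "a", two of class "b".
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.2, 0.2),
          (0.1, 0.2), (0.3, 0.1), (1.0, 1.0), (1.1, 1.0)]
labels = ["a"] * 6 + ["b"] * 2
query = (0.9, 0.9)

ge = global_estimate(labels)
le = local_estimate(points, labels, query, k=3)

# Per-class difference LE - GE: positive when a class is more frequent
# near the query than it is in the training set as a whole.
diff = {cls: le[cls] - ge[cls] for cls in ge}
print(ge, le, diff)
```

In this toy case the minority class "b" has a positive difference near the query, while the majority class "a" has a negative one. This is the kind of signal the abstract refers to when it mentions the difference between the GE and the LE of each class.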