The fundamental theory of optimal "Anti-Bayesian" parametric pattern classification using order statistics criteria

  • Authors:
  • A. Thomas; B. John Oommen

  • Affiliations:
  • School of Computer Science, Carleton University, Ottawa, Canada K1S 5B6 (both authors)

  • Venue:
  • Pattern Recognition
  • Year:
  • 2013

Abstract

The gold standard for a classifier is the condition of optimality attained by the Bayesian classifier. Within a Bayesian paradigm, if we are allowed to compare the testing sample with only a single point in the feature space from each class, the optimal Bayesian strategy would be to do so based on the (Mahalanobis) distance from the corresponding means. The reader should observe that, in this context, the mean is, in one sense, the most central point in the respective distribution. In this paper, we show that we can obtain optimal results by operating in a diametrically opposite way, i.e., in a so-called "anti-Bayesian" manner. Indeed, we assert the completely counter-intuitive result that, by working with very few points distant from the mean, one can obtain remarkable classification accuracies; the number of points can sometimes be as small as two. Further, if these points are determined by the order statistics of the distributions, the accuracy of our method, referred to as Classification by Moments of Order Statistics (CMOS), attains the optimal Bayes bound. This claim has been proven for many uni-dimensional, and some multi-dimensional, distributions within the exponential family, and the theoretical results have been verified by rigorous experimental testing. Apart from being fascinating and pioneering in their own right, these results also provide a theoretical foundation for the families of Border Identification (BI) algorithms reported in the literature.
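
The sketch below is a minimal illustration of the idea the abstract describes, not the paper's implementation. It assumes two one-dimensional Gaussian classes with equal variance, uses the symmetric 1/3 and 2/3 sample quantiles as a stand-in for the expected order-statistic points used by CMOS, and assigns a test point to the class whose nearest quantile point is closest. The helper names (`quantile_points`, `classify_cmos_like`) and the choice q = 2/3 are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training data: two unit-variance Gaussians with different means.
class0 = rng.normal(loc=0.0, scale=1.0, size=1000)
class1 = rng.normal(loc=2.0, scale=1.0, size=1000)

def quantile_points(samples, q=2.0 / 3.0):
    """Two points distant from the mean: the (1-q) and q sample quantiles."""
    return np.quantile(samples, [1.0 - q, q])

points0 = quantile_points(class0)
points1 = quantile_points(class1)

def classify_cmos_like(x):
    """Assign x to the class whose nearest quantile point is closest."""
    d0 = np.min(np.abs(x - points0))
    d1 = np.min(np.abs(x - points1))
    return 0 if d0 <= d1 else 1

def classify_bayes(x, mu0=0.0, mu1=2.0):
    """Optimal Bayesian rule for equal-variance Gaussians: nearest mean."""
    return 0 if abs(x - mu0) <= abs(x - mu1) else 1

# Compare the two rules on held-out samples.
test0 = rng.normal(0.0, 1.0, size=5000)
test1 = rng.normal(2.0, 1.0, size=5000)
tests = np.concatenate([test0, test1])
labels = np.concatenate([np.zeros_like(test0), np.ones_like(test1)])

acc_cmos = np.mean(np.array([classify_cmos_like(x) for x in tests]) == labels)
acc_bayes = np.mean(np.array([classify_bayes(x) for x in tests]) == labels)
print(f"quantile-point rule accuracy: {acc_cmos:.3f}")
print(f"nearest-mean (Bayes) accuracy: {acc_bayes:.3f}")
```

On this synthetic data the quantile points are placed symmetrically about each mean, so the nearest-point decision boundary falls at the midpoint of the two means, i.e., it coincides with the nearest-mean (Bayes) boundary, which conveys the flavour of the equivalence the abstract claims.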