Voting rules relying on k-nearest neighbors (k-NN) are an effective tool in countless machine learning techniques. Thanks to its simplicity, k-NN classification is very attractive to practitioners and achieves good performance in many practical applications. However, it suffers from several drawbacks, such as sensitivity to "noisy" instances and poor generalization when dealing with sparse, high-dimensional data. In this paper, we tackle the k-NN classification problem at its core by proposing a novel k-NN boosting approach. Specifically, we introduce a supervised learning algorithm, called Universal Nearest Neighbors (UNN), that induces a leveraged k-NN rule by globally minimizing a surrogate risk that upper-bounds the empirical misclassification rate over the training data. Interestingly, this surrogate risk can be chosen arbitrarily from a class of Bregman loss functions, including the familiar exponential, logistic, and squared losses. Furthermore, we show that UNN can efficiently filter a dataset of instances, keeping only a small fraction of the data. Experimental results on the synthetic Ripley dataset show that such a filtering strategy rejects "noisy" examples and yields a classification error close to the optimal Bayes error. Experiments on standard UCI datasets show significant improvements over the current state of the art.
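To make the leveraged k-NN rule concrete, below is a minimal Python sketch in the spirit of UNN, not the authors' implementation. It assumes binary labels in {-1, +1}, the exponential surrogate loss (one member of the Bregman family mentioned above), a single pass over the prototypes with closed-form leveraging coefficients, and a Euclidean metric; the function names (`fit_leveraged_knn`, `predict`) are illustrative.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest training neighbors of each training
    point (the point itself is excluded)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def fit_leveraged_knn(X, y, k=5, eps=1e-12):
    """One boosting pass over the prototypes, with closed-form
    leveraging coefficients for the exponential loss."""
    m = X.shape[0]
    nn = knn_indices(X, k)
    alpha = np.zeros(m)   # leveraging coefficient of each prototype
    w = np.ones(m)        # exponential-loss weights on the examples
    for j in range(m):
        # Reciprocal neighborhood: examples that have prototype j
        # among their k nearest neighbors.
        rnn = np.where((nn == j).any(axis=1))[0]
        if rnn.size == 0:
            continue
        agree = w[rnn][y[rnn] == y[j]].sum()
        disagree = w[rnn][y[rnn] != y[j]].sum()
        alpha[j] = 0.5 * np.log((agree + eps) / (disagree + eps))
        # Boosting-style reweighting: down-weight the examples that
        # the newly leveraged prototype classifies well.
        w[rnn] *= np.exp(-alpha[j] * y[rnn] * y[j])
    return alpha

def predict(X_train, y_train, alpha, X_test, k=5):
    """Leveraged k-NN vote: sign of the alpha-weighted labels of the
    k nearest training prototypes."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=-1)
    nn = np.argsort(d, axis=1)[:, :k]
    return np.sign((alpha[nn] * y_train[nn]).sum(axis=1))

# Usage on synthetic data:
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0] + 0.3 * rng.normal(size=200))
alpha = fit_leveraged_knn(X, y, k=5)
print(predict(X, y, alpha, X[:5], k=5))
```

Note that prototypes whose leveraging coefficient stays near zero contribute nothing to the vote and can be discarded, which gives the flavor of the dataset-filtering strategy described in the abstract.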