Boosting k-nearest neighbor classifier by means of input space projection

  • Authors:
  • Nicolás García-Pedrajas; Domingo Ortiz-Boyer

  • Affiliations:
  • Department of Computing and Numerical Analysis, University of Córdoba, Campus de Rabanales, 14071 Córdoba, Spain; Department of Computing and Numerical Analysis, University of Córdoba, Campus de Rabanales, 14071 Córdoba, Spain

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 12.05

Abstract

The k-nearest neighbors (k-NN) classifier is one of the most widely used classification methods because of several attractive features, such as good generalization and ease of implementation. Although simple, it is usually able to match, and even beat, more sophisticated and complex methods. However, no successful method has been reported so far for applying boosting to k-NN. As boosting methods have proved very effective in improving the generalization capabilities of many classification algorithms, an appropriate application of boosting to k-NN is of great interest. Ensemble methods rely on the instability of their base classifiers to improve performance; because k-NN is fairly stable with respect to resampling, ensembles built by resampling fail to improve the performance of the k-NN classifier. On the other hand, k-NN is very sensitive to input selection, so ensembles based on subspace methods are able to improve on the performance of a single k-NN classifier. In this paper we exploit the sensitivity of k-NN to the input space to develop two methods for boosting k-NN. Both approaches modify the view of the data that each classifier receives so that the accurate classification of difficult instances is favored. The two approaches are compared with the k-NN classifier alone and with the bagging and random subspace methods, and show a marked and significant improvement in generalization error. The comparison is performed on a large test set of 45 problems from the UCI Machine Learning Repository. A further study on noise tolerance shows that the proposed methods are less affected by class label noise than the standard methods.
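
The random subspace baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' boosting method, which the abstract does not specify; it only shows how giving each k-NN ensemble member a different subset of input features changes what that member sees. The function names (`knn_predict`, `random_subspace_knn`) and all parameters are hypothetical, chosen for illustration, and the sketch assumes numeric features and an unweighted majority vote.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, features):
    """Classify x by majority vote among its k nearest training points.

    Distance is squared Euclidean, computed only over the feature
    indices in `features` (the member's view of the input space).
    """
    dists = []
    for row, label in zip(train_X, train_y):
        d = sum((row[j] - x[j]) ** 2 for j in features)
        dists.append((d, label))
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def random_subspace_knn(train_X, train_y, n_members, subspace_size, k, seed=0):
    """Build an ensemble of k-NN members, each using a random feature subset.

    Returns a predict function and the list of feature subsets drawn.
    """
    rng = random.Random(seed)
    n_features = len(train_X[0])
    subspaces = [
        sorted(rng.sample(range(n_features), subspace_size))
        for _ in range(n_members)
    ]

    def predict(x):
        # Each member votes using only its own subset of the features.
        votes = Counter(
            knn_predict(train_X, train_y, x, k, feats) for feats in subspaces
        )
        return votes.most_common(1)[0][0]

    return predict, subspaces


# Toy data (made up for illustration): two well-separated classes.
toy_X = [[0.0, 0.0, 0.0], [0.1, 0.2, 0.1], [1.0, 1.0, 1.0], [0.9, 1.1, 1.0]]
toy_y = ["a", "a", "b", "b"]
toy_predict, toy_subspaces = random_subspace_knn(
    toy_X, toy_y, n_members=5, subspace_size=2, k=1
)
print(toy_predict([0.05, 0.05, 0.05]), toy_subspaces)
```

Every member here is trained on the full set of instances, so the diversity comes only from the feature subsets, which is the property of k-NN the abstract says subspace ensembles rely on.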