Nearest Neighbor Voting in High-Dimensional Data: Learning from Past Occurrences

  • Authors:
  • Nenad Tomasev;Dunja Mladenic

  • Affiliations:
  • -;-

  • Venue:
  • ICDMW '11 Proceedings of the 2011 IEEE 11th International Conference on Data Mining Workshops
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Hub ness is a recently described aspect of the curse of dimensionality inherent to nearest-neighbor methods. In this paper we present a new approach for exploiting the hub ness phenomenon in k-nearest neighbor classification. We argue that some of the neighbor occurrences carry more information than others, by the virtue of being less frequent events. This observation is related to the hub ness phenomenon and we explore how it affects high-dimensional k-nearest neighbor classification. We propose a new algorithm, Hub ness Information k-Nearest Neighbor (HIKNN), which introduces the k-occurrence informativeness into the hub ness-aware k-nearest neighbor voting framework. Our evaluation on high-dimensional data shows significant improvements over both the basic k-nearest neighbor approach and all previously used hub ness-aware approaches.