A probabilistic approach to nearest-neighbor classification: naive hubness Bayesian kNN

  • Authors:
  • Nenad Tomašev; Miloš Radovanović; Dunja Mladenić; Mirjana Ivanović

  • Affiliations:
  • Institute Jožef Stefan, Ljubljana, Slovenia; Department of Mathematics and Informatics, Novi Sad, Serbia; Institute Jožef Stefan, Ljubljana, Slovenia; Department of Mathematics and Informatics, Novi Sad, Serbia

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Abstract

Most machine-learning tasks, including classification, involve dealing with high-dimensional data. It was recently shown that the phenomenon of hubness, inherent to high-dimensional data, can be exploited to improve methods based on nearest neighbors (NNs). Hubness refers to the emergence of points (hubs) that appear among the k NNs of many other points in the data and therefore act as influential points for kNN classification. In this paper, we present a new probabilistic approach to kNN classification, naive hubness Bayesian k-nearest neighbor (NHBNN), which employs hubness for computing class likelihood estimates. Experiments show that NHBNN compares favorably to different variants of the kNN classifier, including probabilistic kNN (PNN), which is often used as an underlying probabilistic framework for NN classification. This suggests that NHBNN is a promising alternative framework for developing probabilistic NN algorithms.
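
To make the notion of hubness concrete, the following is a minimal sketch of how the k-occurrence count of each point (the number of other points that list it among their k NNs) might be computed. It is not the NHBNN algorithm from the paper. The function name `k_occurrences`, the use of Euclidean distance, and the brute-force distance computation are all assumptions made for illustration.

```python
import numpy as np


def k_occurrences(X, k):
    """Count how often each point appears among the k NNs of the other points.

    Hypothetical helper: Euclidean distance and brute-force search are
    assumptions for illustration, not choices taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)

    # A point is not counted as its own neighbor.
    np.fill_diagonal(dist, np.inf)

    # Indices of the k nearest neighbors of every point, shape (n, k).
    knn = np.argsort(dist, axis=1, kind="stable")[:, :k]

    # Number of neighbor lists in which each point occurs.
    return np.bincount(knn.ravel(), minlength=n)


if __name__ == "__main__":
    # Four points on a line; with k = 1 the second point is the
    # nearest neighbor of two other points.
    points = [[0.0], [1.0], [2.5], [10.0]]
    print(k_occurrences(points, k=1))
```

Points with unusually high counts would be the hubs in this sense. The counts always sum to n times k, since every point contributes exactly k neighbor slots.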