A method of learning weighted similarity function to improve the performance of nearest neighbor rule

  • Authors:
  • Mansoor Zolghadri Jahromi; Elham Parvinnia; Robert John

  • Affiliations:
  • Centre for Computational Intelligence, Department of Informatics, De Montfort University, Leicester, LE1 9BH, UK; Department of Computer Science and Engineering, Shiraz University, Shiraz, Iran; Centre for Computational Intelligence, Department of Informatics, De Montfort University, Leicester, LE1 9BH, UK

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2009

Abstract

The performance of the Nearest Neighbor (NN) classifier is known to be sensitive to the distance (or similarity) function used to classify a test instance. Another major disadvantage of NN is that it uses all training instances in the generalization phase, which can cause slow execution and high storage requirements when dealing with large datasets. Many solutions have been proposed in past research to handle one or both of these problems. The scheme proposed in this paper tackles both problems by assigning a weight to each training instance. The weight of a training instance is used in the generalization phase to calculate the distance (or similarity) of a query pattern to that instance. The basic NN classifier can be viewed as a special case of this scheme in which all instances are treated equally (all training instances are assigned equal weight). Using this form of weighted similarity measure, we propose a learning algorithm that attempts to maximize the leave-one-out (LV1) classification rate of the NN rule by adjusting the weights of the training instances. At the same time, the algorithm reduces the size of the training set and can therefore be viewed as a powerful instance reduction technique: an instance with zero weight is not used in the generalization phase and can be virtually removed from the training set. We show that our scheme has comparable or better performance than some recent methods proposed in the literature for learning the distance function and/or prototype reduction.
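
The abstract describes the weighted-similarity scheme only at a high level, so the Python sketch below is an illustrative reading of it rather than the authors' algorithm. The specific similarity form (weight divided by one plus the Euclidean distance), the greedy zero-out loop, and all function names are assumptions introduced here. It shows the three ideas the abstract relies on: a per-instance weight entering the similarity computation, zero-weight instances dropping out of the generalization phase, and the leave-one-out (LV1) rate guiding weight adjustment.

```python
# Illustrative sketch only -- not the paper's exact algorithm. The weighted
# similarity form and the greedy pruning loop below are assumptions made for
# demonstration; the paper only states that each training instance's weight
# enters the distance/similarity computation and that zero-weight instances
# are effectively removed from the training set.

import numpy as np


def weighted_similarity(query, instance, weight):
    """Similarity of a query to one training instance, scaled by its weight.

    Assumed form: weight / (1 + Euclidean distance).
    """
    return weight / (1.0 + np.linalg.norm(query - instance))


def classify(query, X, y, weights):
    """Weighted-NN rule: label of the most similar weighted training instance."""
    best_label, best_sim = None, -np.inf
    for xi, yi, wi in zip(X, y, weights):
        if wi == 0.0:            # pruned instance: never consulted
            continue
        sim = weighted_similarity(query, xi, wi)
        if sim > best_sim:
            best_sim, best_label = sim, yi
    return best_label


def leave_one_out_rate(X, y, weights):
    """LV1 accuracy of the weighted-NN rule on the training set itself."""
    correct = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i            # hold out instance i
        pred = classify(X[i], X[mask], y[mask], weights[mask])
        correct += int(pred == y[i])
    return correct / len(X)


if __name__ == "__main__":
    # Toy usage: start from uniform weights (plain NN) and greedily zero out
    # instances whenever doing so does not reduce the LV1 rate -- a simplified
    # stand-in for the weight-learning step described in the abstract.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    weights = np.ones(len(X))

    baseline = leave_one_out_rate(X, y, weights)
    for i in range(len(X)):
        trial = weights.copy()
        trial[i] = 0.0
        if leave_one_out_rate(X, y, trial) >= baseline:
            weights = trial                      # keep the reduction
    print("LV1 rate:", leave_one_out_rate(X, y, weights),
          "instances kept:", int((weights > 0).sum()))
```

Setting a weight to zero and checking the LV1 rate makes the instance-reduction effect concrete: any instance that can be silenced without hurting leave-one-out accuracy never participates in classification again, so the stored training set shrinks while the decision rule is preserved.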