A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms

  • Authors:
  • Dietrich Wettschereck, David W. Aha, Takao Mohri

  • Affiliations:
  • Dietrich Wettschereck: GMD (German National Research Center for Information Technology), Schloß Birlinghoven, 53754 Sankt Augustin, Germany. E-mail: dietrich.wettschereck@gmd.de
  • David W. Aha: Navy Center for Applied Research in Artificial Intelligence, Naval Research Laboratory, Washington, DC, USA. E-mail: aha@aic.nrl.navy.mil
  • Takao Mohri: Hidehiko Tanaka Lab., Department of Electrical Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan. E-mail: mohri@mtl.t.u-tokyo.ac.jp

  • Venue:
  • Artificial Intelligence Review - Special issue on lazy learning
  • Year:
  • 1997


Abstract

Many lazy learning algorithms are derivatives of the k-nearest neighbor (k-NN) classifier, which uses a distance function to generate predictions from stored instances. Several studies have shown that k-NN's performance is highly sensitive to the definition of its distance function. Many k-NN variants have been proposed to reduce this sensitivity by parameterizing the distance function with feature weights. However, these variants have not been categorized or empirically compared. This paper reviews a class of weight-setting methods for lazy learning algorithms. We introduce a framework for distinguishing these methods and empirically compare them. We observed four trends from our experiments and conducted further studies to highlight them. Our results suggest that methods which use performance feedback to assign weight settings demonstrated three advantages over other methods: they require less pre-processing, perform better in the presence of interacting features, and generally require less training data to learn good settings. We also found that continuous weighting methods tend to outperform feature selection algorithms for tasks where some features are useful but less important than others.
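To make the central idea concrete, the sketch below shows a feature-weighted Euclidean distance inside a plain k-NN classifier. This is not the paper's own code; the function names (`weighted_distance`, `knn_predict`) and the choice of Euclidean distance and majority voting are illustrative assumptions. Setting a weight to 0 drops its feature entirely, which recovers feature selection as the special case where weights are restricted to {0, 1}; continuous weights in between model the "useful but less important" features the abstract refers to.

```python
import numpy as np

def weighted_distance(x, y, w):
    """Feature-weighted Euclidean distance. Larger w[i] makes
    feature i more influential; w[i] = 0 removes it entirely
    (the feature-selection special case, w restricted to {0, 1})."""
    return np.sqrt(np.sum(w * (x - y) ** 2))

def knn_predict(query, X_train, y_train, w, k=3):
    """Classify `query` by majority vote among the k stored
    instances closest under the weighted distance."""
    dists = [weighted_distance(query, x, w) for x in X_train]
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Hypothetical usage: the second feature is pure noise, so its
# weight is set to 0; a weight-setting method would learn this.
X_train = np.array([[1.0, 0.3], [1.1, 0.9], [5.0, 0.5], [5.2, 0.1]])
y_train = np.array([0, 0, 1, 1])
w = np.array([1.0, 0.0])
print(knn_predict(np.array([1.2, 0.8]), X_train, y_train, w))  # -> 0
```

The weight-setting methods the paper surveys differ chiefly in how `w` is obtained: performance-feedback methods adjust it based on classification results, while preset methods compute it from statistical properties of the training data.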