PINE: Podium Incremental Neighbor Evaluator for classifying spatial data

  • Authors:
  • William Perrizo; Qin Ding; Anne Denton; Kirk Scott; Qiang Ding; Maleq Khan

  • Affiliations:
  • North Dakota State University, Fargo, ND; Penn State Harrisburg, Middletown, PA; North Dakota State University, Fargo, ND; University of Alaska Anchorage, Anchorage, AK; North Dakota State University, Fargo, ND; Purdue University, West Lafayette, IN

  • Venue:
  • Proceedings of the 2003 ACM symposium on Applied computing
  • Year:
  • 2003

Abstract

Given a set of training data, k-nearest neighbor (KNN) classification predicts the class value for an unknown tuple X by searching the training set for the k nearest neighbors of X and then classifying X according to the most frequent class among those k neighbors. Each of the k nearest neighbors casts an equal vote for the class of X. In this paper, we propose a new algorithm, Podium Incremental Neighbor Evaluator (PINE), in which nearest neighbors are weighted for voting. A metric called HOBBit is used as the distance metric, and a data structure, the P-tree, is used for efficient implementation of the PINE algorithm on spatial data. Our experiments show that, by using a Gaussian podium function, PINE outperforms the KNN method in terms of classification accuracy for spatial data. In addition, in the PINE algorithm all training instances are potential neighbors, so the value of k need not be pre-specified as in KNN methods. By assigning high weights to the nearest neighbors and low (even zero) weights to other neighbors, high classification accuracy can be achieved.
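
The weighted-voting idea in the abstract can be illustrated with a minimal Python sketch. Several things here are assumptions, not details given in the abstract: the HOBBit distance is taken to be the number of low-order bits that must be dropped before two integers agree, with the maximum taken over attributes; the Gaussian podium function is taken to be exp(-d^2 / (2 sigma^2)); and the names `hobbit_distance`, `gaussian_podium`, `pine_classify`, and the parameter `sigma` are hypothetical. The sketch loops over the training tuples directly and does not use P-trees, so it shows only the voting scheme, not the paper's efficient implementation.

```python
import math
from collections import defaultdict


def hobbit_distance(a, b):
    """Assumed HOBBit distance between two non-negative integers:
    the number of low-order bits to discard before a and b are equal."""
    return (a ^ b).bit_length()


def tuple_distance(p, q):
    """Assumed multi-attribute distance: the maximum per-attribute
    HOBBit distance."""
    return max(hobbit_distance(a, b) for a, b in zip(p, q))


def gaussian_podium(d, sigma):
    """Assumed Gaussian podium function: the vote weight decays with
    the square of the distance."""
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def pine_classify(train, x, sigma=1.0):
    """Every training tuple votes for its class, weighted by the podium
    function, so no neighbor count k has to be chosen in advance."""
    votes = defaultdict(float)
    for features, label in train:
        votes[label] += gaussian_podium(tuple_distance(features, x), sigma)
    return max(votes, key=votes.get)


# Hypothetical 8-bit, two-band training tuples with class labels.
train = [
    ((12, 8), "a"),
    ((13, 9), "a"),
    ((200, 100), "b"),
    ((201, 101), "b"),
    ((202, 99), "b"),
]
print(pine_classify(train, (12, 9)))
```

In this toy data, class "b" has more tuples overall, but they are far from the query in HOBBit terms and so receive near-zero weight; the two close "a" tuples dominate the vote. A smaller `sigma` concentrates weight on the closest tuples, while a very large `sigma` approaches an unweighted vote over the whole training set.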