Peculiarity Analysis for Classifications

  • Authors:
  • Jian Yang;Ning Zhong;Yiyu Yao;Jue Wang

  • Venue:
  • ICDM '09 Proceedings of the 2009 Ninth IEEE International Conference on Data Mining
  • Year:
  • 2009

Abstract

Peculiarity-oriented mining (POM) is a new data mining method consisting of two steps: peculiar data identification and peculiar data analysis. The peculiarity factor (PF) and the local peculiarity factor (LPF) are important concepts used to describe the peculiarity of points in the identification step, and both can be studied at the attribute level and at the record level. In this paper, a new record LPF, called the distance-based record LPF (D-record LPF), is proposed; it is defined as the sum of the distances between a point and its nearest neighbors. It is proved mathematically that the D-record LPF can accurately characterize the probability density function of a continuous m-dimensional distribution. This provides a theoretical basis for some existing distance-based anomaly detection techniques. More importantly, it also provides an effective method for describing the class-conditional probabilities in the Bayesian classifier, which makes it possible to apply peculiarity analysis to classification problems. A novel algorithm, the LPF-Bayes classifier, and its kernelized implementation are presented, both of which are related to the Bayesian classifier. Experimental results on several benchmark data sets demonstrate that the proposed classifiers are effective.
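
To make the definition concrete, the following is a minimal sketch of the D-record LPF as the abstract describes it: the sum of distances between a point and its nearest neighbors. The abstract does not specify the distance metric, the number of neighbors, or the decision rule of the LPF-Bayes classifier, so the Euclidean distance, the parameter `k`, and the function names `d_record_lpf`, `lpf_scores`, and `classify_by_lpf` are all assumptions made for illustration. In particular, `classify_by_lpf` is a hypothetical rule (pick the class in which the point has the smallest LPF, ignoring class priors) and should not be read as the authors' algorithm.

```python
import math


def d_record_lpf(point, data, k):
    """Sum of Euclidean distances from `point` to its k nearest points in `data`.

    Assumption: Euclidean distance and a fixed neighbor count k.
    """
    dists = sorted(math.dist(point, other) for other in data)
    return sum(dists[:k])


def lpf_scores(data, k):
    """D-record LPF of every point, excluding the point itself as a neighbor."""
    scores = []
    for i, p in enumerate(data):
        others = data[:i] + data[i + 1:]
        scores.append(d_record_lpf(p, others, k))
    return scores


def classify_by_lpf(point, data_by_class, k):
    """Hypothetical rule: choose the class whose training points give the
    smallest LPF for `point` (a small LPF suggests a high local density)."""
    return min(data_by_class,
               key=lambda c: d_record_lpf(point, data_by_class[c], k))


# Four points on a unit square plus one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = lpf_scores(points, k=2)
most_peculiar = max(range(len(points)), key=lambda i: scores[i])

# Two well-separated toy classes.
train = {
    "a": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "b": [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)],
}
label_near_a = classify_by_lpf((0.5, 0.5), train, k=2)
label_near_b = classify_by_lpf((5.5, 5.5), train, k=2)
```

In this toy data the isolated point `(5.0, 5.0)` receives the largest score, which matches the intuition that a large sum of neighbor distances corresponds to a low-density region. The classification rule relies on the same intuition in reverse: a query point is assigned to the class in which it looks least peculiar.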