An Efficient Outlier Mining Algorithm for Large Dataset

  • Authors:
  • Peng Yang;Biao Huang

  • Affiliations:
  • -;-

  • Venue:
  • ICIII '08 Proceedings of the 2008 International Conference on Information Management, Innovation Management and Industrial Engineering - Volume 01
  • Year:
  • 2008

Abstract

Because outliers often carry useful information, outlier detection has become an active topic in data mining. This paper proposes an efficient outlier mining algorithm based on k-nearest neighbors (KNN). It detects outliers more accurately by defining a correlation matrix that accounts for the importance of attributes and the correlations between them. In addition, the algorithm uses an R-tree data structure and a pruning scheme to drastically reduce computation time. Experimental results show that the algorithm is more efficient than the traditional KNN algorithm. It provides an effective solution for outlier mining in large datasets.
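The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the outlier score of a point is its distance to its k-th nearest neighbor, that the correlation matrix enters as a quadratic-form distance, and that pruning works by abandoning a point once its score can no longer exceed the weakest of the current top-n outliers. The names weighted_distance, top_n_outliers, and the matrix m are hypothetical, and the R-tree is left out in favor of a brute-force scan.

```python
import heapq
import math


def weighted_distance(x, y, m):
    """Quadratic-form distance sqrt((x - y)^T M (x - y)).

    M is a hypothetical attribute weight/correlation matrix (an assumption,
    not taken from the paper). It is assumed symmetric positive semi-definite
    so that the value under the square root is non-negative.
    """
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for i, di in enumerate(diff):
        for j, dj in enumerate(diff):
            total += di * m[i][j] * dj
    return math.sqrt(max(total, 0.0))


def top_n_outliers(points, k, n, m):
    """Return up to n (score, index) pairs with the largest k-th-NN distance.

    Brute-force scan with a cutoff-based pruning rule; no R-tree is used.
    """
    top = []      # min-heap of (score, index): best n outliers found so far
    cutoff = 0.0  # smallest score in `top` once it holds n entries
    for i, p in enumerate(points):
        nearest = []  # negated distances: max-heap of the k smallest so far
        pruned = False
        for j, q in enumerate(points):
            if i == j:
                continue
            d = weighted_distance(p, q, m)
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # The k-th smallest distance seen so far is an upper bound on the
            # final score, so the point can be dropped once it falls to the cutoff.
            if len(nearest) == k and len(top) == n and -nearest[0] <= cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(top) < n:
            heapq.heappush(top, (score, i))
        else:
            heapq.heapreplace(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]
    return sorted(top, reverse=True)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(top_n_outliers(data, k=2, n=1, m=identity))
```

With an identity matrix the distance reduces to the ordinary Euclidean distance; off-diagonal or unequal diagonal entries change how much each attribute, and each pair of attributes, contributes. A spatial index such as an R-tree would replace the inner linear scan with a nearest-neighbor query, which is where most of the savings on a large dataset would be expected to come from.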