Detection of Local Outlier over Dynamic Data Streams Using Efficient Partitioning Method

  • Authors:
  • Manzoor Elahi;Kun Li;Wasif Nisar;Xinjie Lv;Hongan Wang

  • Venue:
  • CSIE '09 Proceedings of the 2009 WRI World Congress on Computer Science and Information Engineering - Volume 04
  • Year:
  • 2009

Abstract

Outlier detection is the process of identifying data objects that are grossly different from, or inconsistent with, the rest of the data. Important data mining applications include fraud detection, customer behavior analysis, and intrusion detection. A number of good algorithms exist for detecting outliers when the entire data set is available and the algorithm can make more than one pass over it. Among the existing methods, LOF (Local Outlier Factor), a density-based method, is very effective at detecting all forms of outliers. The LOF algorithm cannot be applied directly to a data stream, because the large number of nearest-neighbor searches, lrd (local reachability density) computations, and LOF computations make it highly inefficient in that setting. In this paper we propose a cluster-based partitioning algorithm that divides the stream into safe and candidate regions. In the second phase, we apply the LOF algorithm to these partitions separately, with a slight enhancement to the LOF computation over the candidate regions, to obtain accurate results for the most outstanding outliers. Several experiments on different data sets confirm that our technique finds better outliers at lower computational cost than direct LOF or other enhancements proposed for LOF.
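
To make the cost argument in the abstract concrete, the following is a minimal sketch of the standard batch LOF computation, not the partitioning method proposed in the paper. It shows the three steps the abstract names: a nearest-neighbor search for every point, then lrd, then the LOF score. The function names `knn` and `lof_scores` and the sample data are illustrative assumptions. The sketch uses exactly k neighbors (ignoring distance ties) and assumes no duplicate points.

```python
import math


def knn(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs.

    Brute-force search: O(n log n) per query, which is the cost that makes
    plain LOF expensive on a stream.
    """
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]


def lof_scores(points, k):
    """Compute a LOF score for every point (simplified: exactly k neighbours).

    Assumes all points are distinct, so no reachability distance is zero.
    """
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neigh]

    # lrd(p) = 1 / mean reachability distance from p to its neighbours,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF(p) = mean of lrd(o) / lrd(p) over the neighbours o of p.
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


# Hypothetical data: a tight square cluster plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = lof_scores(points, k=2)
outlier_index = max(range(len(points)), key=lambda i: scores[i])
```

Scores close to 1 indicate a point whose local density matches that of its neighbors, while scores well above 1 indicate an outlier. Because every point needs its own neighbor search, restricting the full computation to a subset of the stream, as the abstract's candidate regions suggest, is where the savings would come from.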