Outlier detection with streaming dyadic decomposition

  • Authors: Chetan Gupta; Robert Grossman
  • Affiliations: Dept. of Mathematics, Statistics and Computer Science, University of Illinois, Chicago and Hewlett Packard Labs; Open Data Partners
  • Venue: ICDM'07 Proceedings of the 7th industrial conference on Advances in data mining: theoretical aspects and applications
  • Year: 2007

Abstract

In this work we introduce a new algorithm for detecting outliers in streaming data in R^n. The basic idea is to compute a dyadic decomposition of the streaming data into cubes in R^n. A dyadic decomposition is obtained by recursively bisecting the cube in which the data lies. We call a dyadic decomposition computed in a streaming setting a streaming dyadic decomposition. If we view the streaming dyadic decomposition as a tree with a fixed, sufficiently large maximum depth, then outliers are naturally defined by cubes that contain a small number of points, counted either in the cube itself or in the cube together with its neighboring cubes. We discuss some properties of outlier detection with streaming dyadic decomposition and present experimental results on real and artificial data sets.
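
The following Python sketch illustrates the general idea of counting points per dyadic cube and flagging sparsely populated cubes. It is not the authors' algorithm, which the abstract does not specify in detail. It assumes the data lies in the unit cube [0, 1)^n, tracks counts at a single fixed depth rather than maintaining a full tree, and uses a hypothetical class name (StreamingDyadicCounter) and a user-chosen threshold, none of which come from the paper.

```python
from collections import Counter
from itertools import product


class StreamingDyadicCounter:
    """Hypothetical helper: counts points per dyadic cube of [0, 1)^n at one depth."""

    def __init__(self, dim, depth):
        self.dim = dim
        self.depth = depth
        self.counts = Counter()  # maps a cube's integer index tuple to its point count

    def cell(self, point):
        # After `depth` bisections each axis is split into 2**depth intervals;
        # the integer part of x * 2**depth names the interval that holds x.
        side = 1 << self.depth
        return tuple(min(int(x * side), side - 1) for x in point)

    def add(self, point):
        # One pass, constant work per point: suitable for a stream.
        self.counts[self.cell(point)] += 1

    def neighborhood_count(self, cell):
        # Sum over the cube itself and its (up to) 3**dim - 1 adjacent cubes.
        total = 0
        for offset in product((-1, 0, 1), repeat=self.dim):
            neighbor = tuple(c + o for c, o in zip(cell, offset))
            total += self.counts.get(neighbor, 0)
        return total

    def is_outlier(self, point, threshold, use_neighbors=True):
        # Assumed rule: a point is flagged when its cube (optionally together
        # with the neighboring cubes) holds at most `threshold` points.
        cell = self.cell(point)
        if use_neighbors:
            n = self.neighborhood_count(cell)
        else:
            n = self.counts.get(cell, 0)
        return n <= threshold


# Usage: four clustered points and one isolated point in the unit square.
tree = StreamingDyadicCounter(dim=2, depth=3)
for p in [(0.10, 0.10), (0.11, 0.12), (0.12, 0.11), (0.13, 0.10), (0.90, 0.90)]:
    tree.add(p)

isolated_flag = tree.is_outlier((0.90, 0.90), threshold=1)
cluster_flag = tree.is_outlier((0.10, 0.10), threshold=1)
```

Including neighboring cubes in the count matters for points that fall just across a cube boundary from a dense region: in the usage above, (0.13, 0.10) sits alone in its own cube but is not flagged once the adjacent cube's three points are counted.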