Distance-based outlier queries in data streams: the novel task and algorithms

  • Authors: Fabrizio Angiulli; Fabio Fassetti
  • Affiliations: DEIS, Università della Calabria, 87036 Rende, Italy (both authors)
  • Venue: Data Mining and Knowledge Discovery
  • Year: 2010

Abstract

This work proposes a method for detecting distance-based outliers in data streams under the sliding window model. It introduces the novel notion of a one-time outlier query, which detects anomalies in the current window at arbitrary points in time. Three algorithms are presented. The first answers outlier queries exactly, but has larger space requirements than the other two. The second is derived from the exact one: it reduces memory requirements and returns an approximate answer based on estimates with a statistical guarantee. The third is a specialization of the approximate algorithm that works with strictly fixed memory requirements. The accuracy and memory consumption of the algorithms are assessed theoretically. Moreover, experimental results confirm the effectiveness of the proposed approach and the good quality of its solutions.
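
To make the task concrete, the following is a minimal sketch of a one-time outlier query over a sliding window. It is not any of the three algorithms from the paper: it is a brute-force baseline that stores the whole window and recomputes everything at query time. It assumes the common distance-based definition, which the abstract does not spell out: a point is an outlier if fewer than k other points in the window lie within distance R of it. The class name, method names, and parameters (`SlidingWindowOutliers`, `window`, `radius`, `k`) are hypothetical and chosen only for illustration.

```python
from collections import deque
from math import dist  # Euclidean distance, available since Python 3.8


class SlidingWindowOutliers:
    """Brute-force baseline (illustrative only, not the paper's method)."""

    def __init__(self, window, radius, k):
        self.radius = radius  # assumed neighborhood radius R
        self.k = k            # assumed minimum neighbor count
        # A bounded deque drops the oldest point once the window is full.
        self.points = deque(maxlen=window)

    def insert(self, point):
        """Add a newly arrived stream point to the current window."""
        self.points.append(tuple(point))

    def query(self):
        """Return window points with fewer than k neighbors within radius."""
        outliers = []
        for i, p in enumerate(self.points):
            neighbors = sum(
                1
                for j, q in enumerate(self.points)
                if i != j and dist(p, q) <= self.radius
            )
            if neighbors < self.k:
                outliers.append(p)
        return outliers


detector = SlidingWindowOutliers(window=5, radius=1.0, k=2)
for x in [0.0, 0.1, 0.2, 5.0, 0.3, 0.4]:
    detector.insert((x,))
print(detector.query())  # prints [(5.0,)]
```

Each query in this sketch costs time quadratic in the window size and keeps every point in memory, which is the kind of cost that motivates the exact, approximate, and fixed-memory algorithms described in the abstract.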