AnyOut: anytime outlier detection on streaming data

  • Authors:
  • Ira Assent; Philipp Kranen; Corinna Baldauf; Thomas Seidl

  • Affiliations:
  • Dept. of Computer Science, Aarhus University, Denmark; Data Management and Data Exploration Group, RWTH Aachen University, Germany; Data Management and Data Exploration Group, RWTH Aachen University, Germany; Data Management and Data Exploration Group, RWTH Aachen University, Germany

  • Venue:
  • DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
  • Year:
  • 2012

Abstract

With the growth of sensor and monitoring applications, data mining on streaming data is receiving increasing research attention. Because data is generated continuously, mining algorithms must be able to analyze it in a single pass. In many applications the rate at which data objects arrive varies greatly. This has led to anytime mining algorithms for classification and clustering, which mine data until they are interrupted, at a point unknown in advance, by the next object in the stream. In this work we investigate anytime outlier detection: the problem of determining within any period of time whether an object in a data stream is anomalous. The more time is available, the more reliable the decision should be. We introduce AnyOut, an algorithm capable of solving anytime outlier detection, and investigate different approaches to building the underlying data structure. We propose a confidence measure for AnyOut that makes it possible to improve performance on constant data streams. We evaluate our method in thorough experiments and compare its performance with established algorithms for outlier detection.
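
To make the anytime idea concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify the data structure or the score, so everything here is an assumption for illustration: a hypothetical hierarchy of cluster summaries (`Node`), a score defined as the distance to the closest summary's mean, and a step budget (`max_steps`) standing in for the unknown time until the next object arrives. A coarse score is available immediately and is refined for as long as the budget allows.

```python
from dataclasses import dataclass, field
from math import dist
from typing import List, Sequence


@dataclass
class Node:
    """Hypothetical cluster summary: a mean vector plus finer-grained children."""
    mean: Sequence[float]
    children: List["Node"] = field(default_factory=list)


def anytime_outlier_score(x: Sequence[float], root: Node, max_steps: int) -> float:
    """Refine an outlier score for x until the step budget is used up.

    max_steps is an illustrative stand-in for the time available before
    the next stream object interrupts the computation.
    """
    node = root
    score = dist(x, node.mean)  # coarsest estimate, available at once
    for _ in range(max_steps):
        if not node.children:
            break  # finest level reached; nothing left to refine
        # Descend to the child summary closest to x.
        node = min(node.children, key=lambda c: dist(x, c.mean))
        score = dist(x, node.mean)  # finer estimate replaces the coarser one
    return score


# Toy hierarchy (made-up values): one root, two clusters, one of them split further.
root = Node(mean=(0.0, 0.0), children=[
    Node(mean=(-5.0, 0.0)),
    Node(mean=(5.0, 0.0), children=[
        Node(mean=(4.0, 0.0)),
        Node(mean=(6.0, 0.0)),
    ]),
])

x = (6.0, 0.0)
for steps in range(3):
    print(steps, anytime_outlier_score(x, root, steps))
```

In this toy run the object looks far from the data under the coarsest summary, and each additional step moves the score closer to the value obtained at the finest level, which is the behavior the abstract asks of an anytime method: more time, more reliable decision.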