Continuous outlier detection in data streams: an extensible framework and state-of-the-art algorithms

  • Authors:
  • Dimitrios Georgiadis;Maria Kontaki;Anastasios Gounaris;Apostolos N. Papadopoulos;Kostas Tsichlas;Yannis Manolopoulos

  • Affiliations:
  • Aristotle University, Thessaloniki, Greece;Aristotle University, Thessaloniki, Greece;Aristotle University, Thessaloniki, Greece;Aristotle University, Thessaloniki, Greece;Aristotle University, Thessaloniki, Greece;Aristotle University, Thessaloniki, Greece

  • Venue:
  • Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
  • Year:
  • 2013

Abstract

Anomaly detection is an important data mining task that aims to discover elements deviating significantly from the expected behavior; such elements are termed outliers. One of the most widely used criteria for deciding whether an element is an outlier compares the number of neighboring elements within a fixed distance (R) against a fixed threshold (k). Such outliers are referred to as distance-based outliers and are the focus of this work. In this demo, we show both an extensible framework for outlier detection algorithms and specific outlier detection algorithms for the demanding case where outlier detection is performed continuously over a data stream. More specifically, (i) we demonstrate a novel flavor of the open-source, publicly available tool for Massive Online Analysis (MOA), endowed with capabilities to encapsulate algorithms that continuously detect outliers, and (ii) we present four online outlier detection algorithms. Two of these algorithms were designed by the authors of this demo, with a view to improving on key aspects of outlier mining, such as running time, flexibility, and space requirements.
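To make the distance-based criterion concrete, the following is a minimal sketch of a naive baseline. It is not one of the four algorithms presented in the demo and it does not use the MOA API. It assumes the common convention that an element is an outlier when it has fewer than k neighbors within distance R, a count-based sliding window, and Euclidean distance; the class name `NaiveWindowOutlierDetector` and its parameters are hypothetical and chosen for illustration only.

```python
from collections import deque
from math import dist


class NaiveWindowOutlierDetector:
    """Naive baseline for illustration; not one of the demo's algorithms."""

    def __init__(self, radius, k, window_size):
        self.radius = radius  # R: neighborhood distance
        self.k = k            # k: neighbor threshold (assumed: fewer than k => outlier)
        # Count-based sliding window (an assumption); the oldest point is
        # evicted automatically once window_size points are stored.
        self.window = deque(maxlen=window_size)

    def _neighbor_count(self, index):
        # Count the other points in the window lying within distance R.
        point = self.window[index]
        return sum(
            1
            for j, other in enumerate(self.window)
            if j != index and dist(point, other) <= self.radius
        )

    def process(self, point):
        """Add a point and return all current outliers in the window."""
        self.window.append(tuple(point))
        # Recompute every point's status from scratch: O(W^2) distance
        # computations per arrival, which is the cost that dedicated
        # streaming algorithms aim to avoid.
        return [
            p for i, p in enumerate(self.window)
            if self._neighbor_count(i) < self.k
        ]


detector = NaiveWindowOutlierDetector(radius=1.0, k=2, window_size=4)
for p in [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (10.0, 10.0)]:
    outliers = detector.process(p)
print(outliers)  # only the isolated point (10.0, 10.0) remains an outlier
```

Because the window slides, a point's status can change in both directions: a new arrival may supply the missing neighbor, while an expiring point may take one away. Recomputing everything on each arrival, as this sketch does, is simple but costly, which illustrates why continuous detection over a stream is a demanding case.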