Atypicity detection in data streams: A self-adjusting approach

  • Authors:
  • Alice Marascu;Florent Masseglia

  • Affiliations:
  • INRIA Sophia-Antipolis, Sophia-Antipolis, France;INRIA Sophia-Antipolis, Sophia-Antipolis, France (corresponding author, e-mail: florent.Masseglia@sophia.inria.fr)

  • Venue:
  • Intelligent Data Analysis - Ubiquitous Knowledge Discovery
  • Year:
  • 2011

Abstract

Outlyingness is a subjective concept that depends on how isolated a record, or set of records, is. Clustering-based outlier detection clusters the data and then detects outliers from the characteristics of the resulting clusters (i.e. small, tight and/or dense clusters might be considered outliers). Existing methods require a parameter standing for the "level of outlyingness", such as a maximum size or a percentage of small clusters, in order to build the set of outliers. Unfortunately, setting this parameter manually is not feasible in a streaming environment, given the fast response times usually needed. In this paper we propose Wod, a method that separates outliers from clusters thanks to a natural and effective principle. The main advantages of Wod are that it adjusts automatically to any clustering result and that it is parameterless.
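
The parameter dependence of existing methods can be illustrated with a minimal Python sketch. It is not the Wod method, whose principle the abstract does not detail; it only shows the two kinds of user-set parameter mentioned above (a maximum cluster size and a percentage of small clusters). The function names, the cluster names and the sizes are hypothetical, chosen for illustration.

```python
from typing import Dict, List


def outliers_by_max_size(cluster_sizes: Dict[str, int], max_size: int) -> List[str]:
    """Flag every cluster with at most `max_size` members as an outlier."""
    return sorted(name for name, size in cluster_sizes.items() if size <= max_size)


def outliers_by_percentage(cluster_sizes: Dict[str, int], fraction: float) -> List[str]:
    """Flag the smallest `fraction` of clusters (rounded down) as outliers."""
    # Rank clusters from smallest to largest; ties are broken by name.
    ranked = sorted(cluster_sizes, key=lambda name: (cluster_sizes[name], name))
    count = int(len(ranked) * fraction)
    return sorted(ranked[:count])


# Hypothetical clustering result: cluster name -> number of records.
sizes = {"c1": 120, "c2": 95, "c3": 4, "c4": 2, "c5": 1}

# The set of outliers changes with the value chosen by the user.
small_threshold = outliers_by_max_size(sizes, max_size=2)   # ['c4', 'c5']
large_threshold = outliers_by_max_size(sizes, max_size=5)   # ['c3', 'c4', 'c5']
by_fraction = outliers_by_percentage(sizes, fraction=0.5)   # ['c4', 'c5']
```

Under these assumptions, cluster c3 is an outlier for one parameter value and a regular cluster for another, which is the kind of manual tuning that a streaming setting leaves no time for.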