Parameterless outlier detection in data streams

Authors:
Alice Marascu;Florent Masseglia
Affiliations:
INRIA, route des lucioles - BP;INRIA, route des lucioles - BP
Venue:
Proceedings of the 2009 ACM symposium on Applied Computing
Year:
2009

Abstract

Outlyingness is a subjective concept that relies on the isolation level of a record or set of records. Clustering-based outlier detection clusters the data and then detects outliers according to cluster characteristics (small, tight and/or dense clusters might be considered outliers). Existing methods require a parameter standing for the "level of outlyingness", such as a maximum size or a percentage of small clusters, in order to build the set of outliers. Unfortunately, manually setting this parameter is not feasible in a streaming environment, given the fast response time usually needed. In this paper we propose WOD, a method that separates outliers from clusters by means of a natural and effective principle. The main advantages of WOD are that it adjusts automatically to any clustering result and that it is parameterless.
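
To make the parameter dependence of existing methods concrete, the following is a minimal sketch of the two parameterized baselines the abstract mentions: a maximum cluster size and a percentage of small clusters. It is not WOD, whose separation principle the abstract does not detail. The function names, the `labels` input, and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def small_cluster_outliers(labels, max_size):
    """Flag every cluster with at most max_size members (user-set parameter)."""
    sizes = Counter(labels)
    return {label for label, size in sizes.items() if size <= max_size}


def smallest_fraction_outliers(labels, fraction):
    """Flag the given fraction of clusters, smallest first (user-set parameter)."""
    sizes = Counter(labels)
    # Sort by size, breaking ties by label so the result is deterministic.
    ordered = sorted(sizes, key=lambda label: (sizes[label], label))
    cutoff = int(len(ordered) * fraction)
    return set(ordered[:cutoff])


# Hypothetical clustering result: one cluster label per record.
labels = ["a"] * 50 + ["b"] * 40 + ["c"] * 3 + ["d"] * 1

by_size = small_cluster_outliers(labels, max_size=3)
by_fraction = smallest_fraction_outliers(labels, fraction=0.25)
```

With this sample, `max_size=3` flags clusters `c` and `d`, while `fraction=0.25` flags only `d`. The outlier set changes with the chosen value, which is the manual tuning the abstract argues is impractical on a stream.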