Atypicity detection in data streams: A self-adjusting approach

Authors:
Alice Marascu; Florent Masseglia
Affiliations:
INRIA Sophia-Antipolis, Sophia-Antipolis, France; INRIA Sophia-Antipolis, Sophia-Antipolis, France (corresponding author, e-mail: florent.Masseglia@sophia.inria.fr)
Venue:
Intelligent Data Analysis - Ubiquitous Knowledge Discovery
Year:
2011



Abstract

Outlyingness is a subjective concept that depends on how isolated a record, or a set of records, is. Clustering-based outlier detection clusters the data and then identifies outliers from the characteristics of the resulting clusters (i.e. small, tight and/or dense clusters might be considered outliers). Existing methods require a parameter standing for the "level of outlyingness", such as the maximum size or a percentage of small clusters, in order to build the set of outliers. Unfortunately, setting this parameter manually is not feasible in a streaming environment, given the fast response times usually required. In this paper we propose Wod, a method that separates outliers from clusters by means of a natural and effective principle. The main advantages of Wod are that it adjusts automatically to any clustering result and that it is parameterless.
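
To make the dependence on a user-chosen parameter concrete, the following is a minimal Python sketch of the kind of thresholding the abstract attributes to existing methods. It is not Wod, whose principle is not described in this abstract. The function name `small_cluster_outliers` and the parameters `max_size` and `fraction` are hypothetical, chosen only for illustration, and cluster labels are assumed to be already available from some clustering step.

```python
from collections import Counter


def small_cluster_outliers(labels, max_size=None, fraction=None):
    """Return the set of cluster ids treated as outliers.

    Illustrative baseline only (assumed, not taken from the paper).
    Exactly one user-supplied parameter must be given:
    max_size -- clusters with at most this many records are outliers
    fraction -- this share of the clusters, smallest first, are outliers
    """
    if (max_size is None) == (fraction is None):
        raise ValueError("give exactly one of max_size or fraction")
    sizes = Counter(labels)  # cluster id -> number of records
    if max_size is not None:
        return {c for c, n in sizes.items() if n <= max_size}
    # Rank clusters by size, breaking ties by id for a deterministic result.
    ranked = sorted(sizes, key=lambda c: (sizes[c], c))
    k = int(len(ranked) * fraction)
    return set(ranked[:k])


# Hypothetical clustering result: one label per record.
labels = ["a"] * 50 + ["b"] * 40 + ["c"] * 3 + ["d"] * 1
print(small_cluster_outliers(labels, max_size=3))
print(small_cluster_outliers(labels, fraction=0.25))
```

On the same clustering, the two calls return different outlier sets, which illustrates why a hand-tuned "level of outlyingness" is awkward to maintain when the clustering changes continuously in a stream.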