An interactive approach to outlier detection

Authors:
R. M. Konijn; W. Kowalczyk
Affiliations:
Department of Computer Science, Vrije Universiteit Amsterdam;Department of Computer Science, Vrije Universiteit Amsterdam
Venue:
RSKT'10 Proceedings of the 5th international conference on Rough set and knowledge technology
Year:
2010

Abstract

In this paper we describe an interactive approach for finding outliers in large sets of records, such as those collected by banks, insurance companies, and web shops. The key idea behind our approach is the use of an outlier score function that is easy to compute and easy to interpret. This function is used to identify a set of potential outliers. The outliers, organized in clusters, are then presented to a domain expert together with context information, such as the characteristics of each cluster and the distribution of scores. The expert then analyzes the outliers and labels them as explainable or non-explainable, after which they are removed from the data. The whole process is repeated until no more interesting outliers can be found.
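The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the score function, so a median/MAD deviation score is swapped in; the "clusters" are simply groups of outliers that share the same most-deviating feature; and the names `robust_scores`, `interactive_outlier_loop`, `ask_expert`, and `threshold` are hypothetical. The stopping rule (no score above the threshold) stands in for "no more interesting outliers can be found".

```python
from statistics import median
from typing import Callable, Dict, List, Sequence, Tuple

Record = Sequence[float]


def robust_scores(records: List[Record]) -> List[Tuple[float, int]]:
    """Return (score, dominant_feature) for each record.

    Stand-in score (an assumption, not the paper's function): the largest
    absolute deviation from the per-feature median, scaled by the MAD.
    """
    n_features = len(records[0])
    meds, mads = [], []
    for j in range(n_features):
        col = [r[j] for r in records]
        m = median(col)
        # Fall back to 1.0 when the MAD is zero to avoid division by zero.
        mad = median(abs(x - m) for x in col) or 1.0
        meds.append(m)
        mads.append(mad)
    out = []
    for r in records:
        devs = [abs(r[j] - meds[j]) / mads[j] for j in range(n_features)]
        j_max = max(range(n_features), key=devs.__getitem__)
        out.append((devs[j_max], j_max))
    return out


def interactive_outlier_loop(
    records: List[Record],
    ask_expert: Callable[[int, List[Record], List[float]], str],
    threshold: float = 5.0,
    max_rounds: int = 10,
) -> Tuple[List[Record], List[Tuple[Record, str]]]:
    """Score, cluster, ask the expert, remove, and repeat."""
    data = list(records)
    labelled: List[Tuple[Record, str]] = []
    for _ in range(max_rounds):
        if len(data) < 3:
            break
        scores = robust_scores(data)
        # Group potential outliers by their most-deviating feature.
        clusters: Dict[int, List[int]] = {}
        for i, (score, feature) in enumerate(scores):
            if score > threshold:
                clusters.setdefault(feature, []).append(i)
        if not clusters:
            break  # nothing exceeds the threshold any more
        removed = set()
        for feature, idxs in clusters.items():
            members = [data[i] for i in idxs]
            member_scores = [scores[i][0] for i in idxs]
            # The expert sees the cluster plus its scores and returns a label.
            label = ask_expert(feature, members, member_scores)
            labelled.extend((m, label) for m in members)
            removed.update(idxs)
        data = [r for i, r in enumerate(data) if i not in removed]
    return data, labelled
```

In practice `ask_expert` would be a user interface rather than a function; here it is a callback so the loop can be run end to end. Removing the labelled records before rescoring matters because extreme values can otherwise mask less extreme outliers in later rounds.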