Finding centric local outliers in categorical/numerical spaces

Authors:
Jeffrey Xu Yu;Weining Qian;Hongjun Lu;Aoying Zhou
Affiliations:
The Chinese University of Hong Kong, Shatin, N.T., Hong Kong, China;Fudan University, Shanghai, China;The Hong Kong University of Science and Technology, Hong Kong, China;Fudan University, Shanghai, China
Venue:
Knowledge and Information Systems
Year:
2006

Abstract

Outlier detection techniques are widely used in applications such as credit-card fraud detection and the monitoring of criminal activities in electronic commerce. These applications attempt to identify outliers as noise, exceptions, or objects around the border. Existing density-based local outlier detection assigns each object a degree to which it is an outlier in a numerical space. In this paper, we propose a novel mutual-reinforcement-based local outlier detection approach. Instead of detecting local outliers as noise, we attempt to identify local outliers in the center, where they are similar to some clusters of objects on the one hand and unique on the other. Our technique can be used, for example, in bank investment to identify a unique body, similar to many good competitors, in which to invest. We attempt to detect local outliers in categorical, ordinal, and numerical data. In categorical data, the challenge is that there are many similar but different ways to specify relationships among the data items. Our mutual-reinforcement-based approach is stable under similar but different user-defined relationships. Our technique can reduce the burden on users of determining the relationships among data items, and it can find explanations of why the outliers are found. We conducted extensive experimental studies using real datasets.