Correlation-based detection of attribute outliers

Authors:
Judice L. Y. Koh; Mong Li Lee; Wynne Hsu; Kai Tak Lam
Affiliations:
Institute for Infocomm Research, Singapore and School of Computing, National University of Singapore; School of Computing, National University of Singapore; School of Computing, National University of Singapore; Institute of High Performance Computing, Singapore
Venue:
DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Year:
2007

Abstract

An outlier is an object that does not conform to the normal behavior of the data set. In data cleaning, outliers are identified to reduce noise in the data. In applications such as fraud detection and stock market analysis, outliers suggest abnormal behavior that requires further investigation. Existing outlier detection methods have focused on class outliers, and research on attribute outliers is limited, even though attribute outliers play an equal role in degrading data quality and reducing data mining accuracy. In this paper, we propose a novel method that detects attribute outliers from the deviating correlation behavior of attributes. We formulate three metrics to evaluate the outlier-ness of attributes and introduce an adaptive factor to distinguish outliers from non-outliers. Experiments with both synthetic and real-world data sets indicate that the proposed method is effective in detecting attribute outliers.
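
To make the idea of "deviating correlation behavior" concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract does not define the three metrics or the adaptive factor, so the score below (the average conditional co-occurrence frequency of a value with the other values in its record), the function name `attribute_outlier_scores`, the toy data, and the fixed cutoff `0.5` are all assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import combinations


def attribute_outlier_scores(records):
    """Score every (row, attribute) cell of a categorical table.

    Hypothetical score, not taken from the paper: for the value in
    attribute i, average over every other attribute j the fraction of
    records carrying row[j] that also carry row[i]. A low score means
    the value rarely co-occurs with the rest of its record.
    """
    n_attrs = len(records[0])
    single = Counter()  # occurrences of (attribute, value)
    pair = Counter()    # co-occurrences of two (attribute, value) pairs
    for row in records:
        for i, v in enumerate(row):
            single[(i, v)] += 1
        for i, j in combinations(range(n_attrs), 2):
            pair[(i, row[i], j, row[j])] += 1

    scores = {}
    for r, row in enumerate(records):
        for i in range(n_attrs):
            fractions = []
            for j in range(n_attrs):
                if j == i:
                    continue
                a, b = (i, j) if i < j else (j, i)
                co = pair[(a, row[a], b, row[b])]
                fractions.append(co / single[(j, row[j])])
            scores[(r, i)] = sum(fractions) / len(fractions)
    return scores


# Toy data (invented): columns are city, country, currency.
records = (
    [("Paris", "France", "EUR")] * 5
    + [("Tokyo", "Japan", "JPY")] * 5
    + [("Paris", "Japan", "EUR")]  # the country disagrees with city and currency
)

scores = attribute_outlier_scores(records)

# Fixed cutoff for illustration; the paper instead introduces an adaptive factor.
CUTOFF = 0.5
flagged = sorted(cell for cell, s in scores.items() if s < CUTOFF)
print(flagged)
```

In this toy table the only flagged cell is the country value in the last record, while the city and currency in the same record score higher because they still agree with each other. That is the distinction the abstract draws: the deviating attribute value is singled out, rather than the whole record.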