Detecting outliers in categorical record databases based on attribute associations

Authors:
Kazuyo Narita;Hiroyuki Kitagawa
Affiliations:
Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Ibaraki, Japan;Graduate School of Systems and Information Engineering, Center for Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
Venue:
APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
Year:
2008

Citing 12

Robust regression and outlier detection
Mining frequent patterns without candidate generation. SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
LOF: identifying density-based local outliers. SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient algorithms for mining outliers from large data sets. SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Outlier detection for high dimensional data. SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Algorithms for Mining Distance-Based Outliers in Large Datasets. VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Finding Intensional Knowledge of Distance-Based Outliers. VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases. VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Mining Deviants in a Time Series Database. VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Scalable and practical probability density estimators for scientific anomaly detection
Example-Based Robust Outlier Detection in High Dimensional Datasets. ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Detecting anomalous records in categorical datasets. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining

Cited 1

Outlier detection in transactional data. Intelligent Data Analysis

Abstract

Outlier detection, a data mining technique to detect rare events, deviant objects, and exceptions from data, has been drawing increasing attention in recent years. Most existing outlier detection algorithms focus on numerical data sets. We target categorical record databases and detect records in which many attribute values are not observed even though they should occur in association with other attribute values in the records. To detect such records as outliers, we provide an outlier degree, which demonstrates sufficient detection performance in accuracy-evaluation experiments compared with the probabilistic approach used in a related work. We also propose an efficient algorithm for detecting such outlier records. Experiments using real data sets show that our method detects interesting records as outliers.
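
The abstract does not give the definition of the outlier degree, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' measure or algorithm. It assumes that associations are simple one-to-one rules between (attribute, value) items, mined with hypothetical support and confidence thresholds, and it scores a record by the fraction of applicable rules whose expected value is missing. The names mine_pair_rules and outlier_degree and both thresholds are illustrative.

```python
from collections import Counter
from itertools import permutations


def mine_pair_rules(records, min_support=0.3, min_confidence=0.8):
    """Mine rules a -> b between single (attribute, value) items.

    Assumption: only one-item antecedents and consequents are considered;
    the thresholds are illustrative defaults, not values from the paper.
    """
    n = len(records)
    item_count = Counter()
    pair_count = Counter()
    for rec in records:
        items = list(rec.items())
        item_count.update(items)
        # Ordered pairs, so a -> b and b -> a are counted separately.
        pair_count.update(permutations(items, 2))
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        confidence = count / item_count[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b))
    return rules


def outlier_degree(record, rules):
    """Fraction of applicable rules whose consequent is absent from the record.

    Assumption: this ratio is a stand-in score; returns 0.0 when no rule applies.
    """
    items = set(record.items())
    applicable = [(a, b) for a, b in rules if a in items]
    if not applicable:
        return 0.0
    violated = sum(1 for _, b in applicable if b not in items)
    return violated / len(applicable)


if __name__ == "__main__":
    # Toy data: nine consistent records and one that breaks the associations.
    normal = {"os": "linux", "shell": "bash", "editor": "vim"}
    odd = {"os": "linux", "shell": "cmd", "editor": "notepad"}
    data = [dict(normal) for _ in range(9)] + [odd]
    rules = mine_pair_rules(data)
    for rec in (normal, odd):
        print(rec, outlier_degree(rec, rules))
```

In this toy data, the odd record contains the value os=linux, which is strongly associated with shell=bash and editor=vim, yet neither of those values appears in it, so it receives a higher score than the consistent records. A practical version would likely need rules over larger itemsets and a more efficient mining step, which the abstract indicates the paper addresses.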