Detecting outliers in categorical record databases based on attribute associations

  • Authors: Kazuyo Narita; Hiroyuki Kitagawa
  • Affiliations: Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Ibaraki, Japan; Graduate School of Systems and Information Engineering, Center for Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
  • Venue: APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
  • Year: 2008

Abstract

Outlier detection, a data mining technique to detect rare events, deviant objects, and exceptions from data, has been drawing increasing attention in recent years. Most existing outlier detection algorithms focus on numerical data sets. We target categorical record databases and detect records in which many attribute values are not observed even though they should occur in association with other attribute values in the records. To detect such records as outliers, we provide an outlier degree, which demonstrates sufficient detection performance in accuracy-evaluation experiments compared with the probabilistic approach used in a related work. We also propose an efficient algorithm for detecting such outlier records. Experiments using real data sets show that our method detects interesting records as outliers.
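
The abstract does not give the definition of the outlier degree, so the following is only a minimal sketch of the general idea it describes: a record looks suspicious when it contains an attribute value that usually co-occurs with another value, yet that other value is absent. The sketch assumes simple pairwise association rules with support and confidence thresholds and a plain count of violated rules as the score. The function names `mine_pair_rules` and `violation_score`, the thresholds, and the toy records are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations


def mine_pair_rules(records, min_support=2, min_confidence=0.8):
    """Mine pairwise rules a -> b, where a and b are (attribute, value) items.

    Assumption: a rule is kept when the pair occurs in at least `min_support`
    records and the confidence count(a, b) / count(a) reaches `min_confidence`.
    """
    item_counts = Counter()
    pair_counts = Counter()
    for rec in records:
        items = list(rec.items())
        item_counts.update(items)
        # Ordered pairs, so both a -> b and b -> a are considered.
        pair_counts.update(permutations(items, 2))
    rules = []
    for (a, b), n_ab in pair_counts.items():
        if n_ab >= min_support and n_ab / item_counts[a] >= min_confidence:
            rules.append((a, b))
    return rules


def violation_score(record, rules):
    """Count rules whose antecedent is in the record but whose consequent is not."""
    score = 0
    for (attr_a, val_a), (attr_b, val_b) in rules:
        if record.get(attr_a) == val_a and record.get(attr_b) != val_b:
            score += 1
    return score


# Hypothetical toy data: five similar records and one that breaks the pattern.
records = [{"os": "linux", "shell": "bash", "editor": "vim"} for _ in range(5)]
records.append({"os": "linux", "shell": "cmd", "editor": "notepad"})

rules = mine_pair_rules(records)
scores = [violation_score(rec, rules) for rec in records]
```

In this toy data only the last record receives a nonzero score, because it contains `os=linux` without the `shell=bash` and `editor=vim` values that normally accompany it. A raw violation count is a stand-in chosen for brevity; the paper's outlier degree and its efficient detection algorithm may differ substantially.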