A model for mining outliers from complex data sets

  • Authors:
  • Hongwei Qi; Jue Wang

  • Affiliations:
  • Institute of Automation, Chinese Academy of Sciences, Beijing, China; Institute of Automation, Chinese Academy of Sciences, Beijing, China

  • Venue:
  • Proceedings of the 2004 ACM symposium on Applied computing
  • Year:
  • 2004

Abstract

To address outlier mining problems in which outliers are highly intermixed with normal data, a general Variance-based Outlier Mining Model (VOMM) is presented, in which the information in the data is decomposed into normal and abnormal components according to their variances. Subject to minimal loss of normal information, the VOMM treats outliers as the top k samples holding the maximal abnormal information in a dataset. The principal curve, a smooth nonparametric curve that passes through the "middle" of the dataset and provides a good nonlinear summary of the data, is then introduced as an algorithm for the VOMM. Experiments on abnormal-return detection in the stock market show that the VOMM is feasible and performs better than the Gaussian model and the GARCH (Generalized Auto-Regressive Conditional Heteroscedasticity) model.
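
The decomposition idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps the principal curve for a straight line (the leading principal component), which is only a linear stand-in. The function name `top_k_outliers` and the parameter `n_normal` are hypothetical. The high-variance directions are treated as the "normal" component, the residual as the "abnormal" component, and the k samples with the largest residual energy are reported as outliers.

```python
import numpy as np

def top_k_outliers(X, k, n_normal=1):
    """Rank samples by residual ("abnormal") energy after a linear projection.

    Assumption: a linear principal subspace replaces the principal curve
    described in the abstract; n_normal is the number of high-variance
    directions treated as the "normal" component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center the data
    # Rows of Vt are principal directions, sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_normal].T                          # basis of the normal subspace
    normal = Xc @ V @ V.T                        # normal component of each sample
    abnormal = Xc - normal                       # abnormal (residual) component
    scores = np.sum(abnormal ** 2, axis=1)       # abnormal information per sample
    top = np.argsort(scores)[::-1][:k]           # indices of the k largest scores
    return top, scores

# Usage: points along y = 2x, plus two samples placed off that line.
x = np.linspace(-1.0, 1.0, 21)
data = np.vstack([np.column_stack([x, 2 * x]), [[0.0, 3.0], [0.5, -2.0]]])
top, scores = top_k_outliers(data, k=2)
```

A straight line cannot follow curved structure, so data with a nonlinear trend would need a nonparametric curve fit in place of the projection step, which is the role the abstract assigns to the principal curve.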