A concept lattice based outlier mining method in low-dimensional subspaces

  • Authors:
  • Jifu Zhang; Yiyong Jiang; Kai H. Chang; Sulan Zhang; Jianghui Cai; Lihua Hu

  • Affiliations:
  • School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, PR China; School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, PR China; Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849-5347, USA; School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, PR China; School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, PR China; School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, PR China

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2009

Abstract

Traditional outlier mining methods identify outliers from a global point of view, which makes it difficult to find data points that deviate only in low-dimensional subspaces. The concept lattice, owing to its straightforwardness, conciseness and completeness in knowledge expression, has become an effective tool for data analysis and knowledge discovery. This paper proposes a concept lattice based outlier mining algorithm (CLOM) for low-dimensional subspaces, which treats the intent of every concept lattice node as a subspace. First, sparsity and density coefficients, which measure outliers in low-dimensional subspaces, are defined and discussed. Second, sparsity subspaces are identified among the node intents using a predefined sparsity coefficient threshold. For each sparsity subspace, the intents of its ancestor nodes are then checked against a predefined density coefficient threshold to determine whether any of them is a density subspace. If an ancestor's intent is a density subspace, the objects in the extent of the node whose intent is the sparsity subspace are defined as outliers. Experimental results on a star spectral database show that CLOM is effective in mining outliers in low-dimensional subspaces and that it greatly improves the accuracy of the results.
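
The two-stage test described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CLOM implementation: the abstract does not give the definitions of the sparsity and density coefficients, so the sketch assumes a simple stand-in, the fraction of objects in a node's extent, for both. The names `concepts`, `clom_sketch`, `example_context` and the threshold values are hypothetical, the lattice is built by brute force rather than by an incremental construction algorithm, and the top node with an empty intent is excluded from the ancestor check as a further assumption.

```python
from itertools import combinations


def concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context.

    Brute force over attribute subsets; only suitable for toy examples.
    """
    objects = set(context)
    attributes = set().union(*context.values())
    found = set()
    for r in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), r):
            # Extent: objects that have every attribute in attrs.
            extent = frozenset(o for o in objects if set(attrs) <= context[o])
            # Intent: attributes shared by every object in the extent.
            if extent:
                intent = frozenset(
                    set.intersection(*(set(context[o]) for o in extent))
                )
            else:
                intent = frozenset(attributes)
            found.add((extent, intent))
    return found


def clom_sketch(context, sparsity_threshold, density_threshold):
    """Flag objects in sparse nodes that have at least one dense ancestor.

    ASSUMPTION: both coefficients are replaced by |extent| / |objects|;
    the paper defines its own coefficients, which are not reproduced here.
    """
    n = len(context)
    nodes = concepts(context)
    outliers = set()
    for extent, intent in nodes:
        if not extent or not intent:
            continue  # skip the bottom node and the top node
        if len(extent) / n > sparsity_threshold:
            continue  # intent is not a sparsity subspace
        for anc_extent, anc_intent in nodes:
            # An ancestor has a strictly smaller, non-empty intent.
            is_ancestor = bool(anc_intent) and anc_intent < intent
            if is_ancestor and len(anc_extent) / n >= density_threshold:
                outliers |= extent
                break
    return outliers


# Hypothetical context: each object maps to its discretized attribute values.
example_context = {
    "s1": {"a=low", "b=low"},
    "s2": {"a=low", "b=low"},
    "s3": {"a=low", "b=low"},
    "s4": {"a=low", "b=high"},
    "s5": {"a=high", "b=high"},
}

print(clom_sketch(example_context, 0.2, 0.8))  # {'s4'}
```

In this toy context, object `s4` is flagged because the subspace `{a=low, b=high}` contains only one of five objects while its ancestor `{a=low}` contains four. Object `s5` also sits in a sparse node, but neither of its ancestors reaches the 0.8 density threshold, so it is not reported.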