Discovering pattern-based subspace clusters by pattern tree

  • Authors:
  • Jihong Guan; Yanglan Gan; Hao Wang

  • Affiliations:
  • Department of Computer Science and Technology, Tongji University, Shanghai 201804, China; Department of Computer Science and Technology, Tongji University, Shanghai 201804, China; Department of Computer Science and Technology, Hefei University of Technology, Hefei 23009, China

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2009

Abstract

Traditional clustering models based on distance similarity are not always effective at capturing correlations among data objects, whereas pattern-based clustering can identify correlations hidden among them. However, state-of-the-art pattern-based clustering methods are inefficient and provide no metric for measuring clustering quality. This paper presents a new pattern-based subspace clustering method that addresses both problems. Observing the analogy between mining frequent itemsets and discovering subspace clusters, we apply the pattern tree, a structure used in frequent itemset mining, to determine the target subspaces in a single scan of the database, which remains efficient on large datasets. Furthermore, we introduce a general clustering quality evaluation model to guide the identification of meaningful clusters. The proposed method lets users flexibly set quality-control parameters to meet different needs. Experimental results on synthetic and real datasets show that our method outperforms existing methods in both efficiency and effectiveness.
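
The contrast between distance similarity and pattern similarity can be illustrated with a minimal sketch. The abstract does not state which pattern-similarity measure the paper uses, so the example below assumes the pScore from the earlier pCluster model, where the score of objects x, y on dimensions a, b is |(x_a - x_b) - (y_a - y_b)|. The function names and the sample data are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def euclidean(x, y):
    """Plain Euclidean distance between two objects over all dimensions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def max_pscore(x, y, dims):
    """Largest pScore of objects x and y over all pairs of the given dimensions.

    Assumption: pScore as defined in the pCluster model, not necessarily
    the measure used in this paper. A value of 0 means the two objects
    follow exactly the same shifted pattern on the subspace `dims`.
    """
    return max(
        (abs((x[a] - x[b]) - (y[a] - y[b])) for a, b in combinations(dims, 2)),
        default=0.0,
    )


# Hypothetical data: y equals x shifted by +10 on dimensions 0..2,
# while dimension 3 breaks the pattern.
x = [1.0, 5.0, 3.0, 9.0]
y = [11.0, 15.0, 13.0, 2.0]

dist = euclidean(x, y)                       # large: objects look dissimilar
score_sub = max_pscore(x, y, [0, 1, 2])      # coherent in the subspace {0, 1, 2}
score_all = max_pscore(x, y, [0, 1, 2, 3])   # incoherent in the full space

print(f"distance={dist:.2f} pscore_sub={score_sub} pscore_all={score_all}")
```

Under these assumptions the two objects are far apart by distance yet perfectly coherent on the subspace {0, 1, 2}, which is the kind of correlation a distance-based model would miss and a pattern-based subspace method aims to find.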