Projective clustering using itemset discovery for multi-dimensional data analysis

  • Authors:
  • Muhammad Umer Arshad; Muhammad Naeem Ayyaz

  • Affiliations:
  • Department of Electrical Engineering, University of Engineering and Technology, Lahore, Pakistan; Department of Electrical Engineering, University of Engineering and Technology, Lahore, Pakistan

  • Venue:
  • MS'06 Proceedings of the 17th IASTED international conference on Modelling and simulation
  • Year:
  • 2006

Abstract

Clustering, also known as unsupervised classification, groups data so that intra-group distances are minimized and inter-group distances are maximized. Most clustering algorithms use all dimensions of the feature (attribute) space to partition objects into groups. However, recent research suggests that clustering in high-dimensional spaces should search for hidden subspaces of lower dimensionality, because data are more likely to form dense clusters in a low-dimensional subspace. In this paper, we present a new, fast, and scalable clustering algorithm, ProjClusID, for the projective clustering problem. We use frequent itemset mining to find projective clusters; to do so, we first discretize the data, mapping it from the continuous to a discrete domain. Our algorithm is both density-based and grid-based and finds a potentially optimal clustering without requiring any input parameters. As a post-clustering step, the data is mapped back to its original continuous domain. Our experimental results on synthetic and real datasets show that the ProjClusID algorithm improves on the accuracy and effectiveness of previous techniques.
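
The general idea of discretizing the data and then mining frequent itemsets can be illustrated with a minimal Python sketch. This is not the ProjClusID algorithm, whose details the abstract does not give. It assumes equal-width binning and a plain level-wise (Apriori-style) search. The names `discretize`, `frequent_itemsets`, `n_bins`, and `min_support` are hypothetical, and unlike the parameter-free method described above, the sketch takes both parameters explicitly. Each item is a (dimension, bin) pair, so a frequent itemset corresponds to a dense grid cell in the subspace spanned by its dimensions.

```python
from itertools import combinations

def discretize(points, n_bins):
    """Map each continuous value to an equal-width bin, one (dim, bin) item per dimension."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    rows = []
    for p in points:
        items = set()
        for d in range(dims):
            # Fall back to width 1.0 when a dimension is constant.
            width = (highs[d] - lows[d]) / n_bins or 1.0
            # Clip so the maximum value lands in the last bin.
            b = min(int((p[d] - lows[d]) / width), n_bins - 1)
            items.add((d, b))
        rows.append(frozenset(items))
    return rows

def frequent_itemsets(rows, min_support):
    """Level-wise search for itemsets contained in at least min_support rows."""
    counts = {}
    for row in rows:
        for item in row:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(level)
    while level:
        # Join frequent k-itemsets into (k+1)-candidates with distinct dimensions.
        candidates = set()
        for a, b in combinations(level, 2):
            u = a | b
            if len(u) == len(a) + 1 and len({d for d, _ in u}) == len(u):
                candidates.add(u)
        level = {}
        for c in candidates:
            n = sum(1 for row in rows if c <= row)
            if n >= min_support:
                level[c] = n
        result.update(level)
    return result

# Toy data (made up for illustration): three points agree in dimensions
# 0 and 1 but are spread out along dimension 2.
points = [
    (0.10, 0.10, 0.0),
    (0.15, 0.12, 0.5),
    (0.12, 0.18, 1.0),
    (0.90, 0.90, 0.2),
    (0.95, 0.85, 0.8),
]
found = frequent_itemsets(discretize(points, n_bins=4), min_support=3)
```

On this toy data, the only frequent two-item set is `{(0, 0), (1, 0)}` with support 3: a dense cell in the subspace of dimensions 0 and 1, with dimension 2 left out because the points do not agree there. Mapping a cell back to the continuous domain amounts to converting each bin index into its value interval.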