CLUC: a natural clustering algorithm for categorical datasets based on cohesion

  • Authors:
  • Aida Nemalhabib; Nematollaah Shiri

  • Affiliations:
  • Concordia University, Montreal, Quebec, Canada; Concordia University, Montreal, Quebec, Canada

  • Venue:
  • Proceedings of the 2006 ACM Symposium on Applied Computing
  • Year:
  • 2006

Abstract

We propose a clustering algorithm for categorical datasets, called CLUC (CLUstering with Cohesion), which uses a novel similarity measure, called cohesion, to determine the degree to which items/objects stick to clusters. We have implemented CLUC and carried out extensive experiments on real-life and synthetic datasets. The results of the experiments and their analysis indicate that CLUC generates high-quality clusters, in that they conform to expert opinion. Our experiments on large synthetic data confirm that CLUC scales as the dataset grows in the number of objects and/or dimensions. We also repeated the experiments with different orderings of the items in the datasets. The results show that the proposed algorithm is order-insensitive.