We propose CLUC (CLUstering with Cohesion), a clustering algorithm for categorical datasets. CLUC uses a novel similarity measure, called cohesion, to determine the degree to which items/objects stick to clusters. We have implemented CLUC and carried out extensive experiments on real-life and synthetic datasets. The results of these experiments and their analysis indicate that CLUC generates high-quality clusters, in that they conform to expert opinion. Our experiments on large synthetic data confirm that CLUC scales as the dataset grows in the number of objects and/or dimensions. We also repeated the experiments with different orderings of the items in the datasets; the results show that the proposed algorithm is order-insensitive.
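
As an illustration of what an order-insensitivity check might look like, the sketch below reruns a clustering function on shuffled copies of a dataset and compares the resulting partitions. It is not the evaluation procedure used for CLUC, which the abstract does not describe. The names `partition_signature`, `is_order_insensitive`, and `toy_cluster` are hypothetical, and `toy_cluster` is a trivial stand-in (grouping by the first attribute value), not CLUC or its cohesion measure.

```python
import random


def partition_signature(labels, ids):
    """Return a clustering as a set of frozensets of object ids.

    This form ignores both the cluster numbering and the input order,
    so two runs can be compared directly.
    """
    clusters = {}
    for obj_id, label in zip(ids, labels):
        clusters.setdefault(label, set()).add(obj_id)
    return {frozenset(members) for members in clusters.values()}


def is_order_insensitive(cluster_fn, records, trials=10, seed=0):
    """Check that cluster_fn yields the same partition under random reorderings."""
    rng = random.Random(seed)
    ids = list(range(len(records)))
    baseline = partition_signature(cluster_fn(records), ids)
    for _ in range(trials):
        order = ids[:]
        rng.shuffle(order)
        shuffled = [records[i] for i in order]
        # Labels come back in shuffled order, so pair them with `order`.
        if partition_signature(cluster_fn(shuffled), order) != baseline:
            return False
    return True


def toy_cluster(records):
    """Hypothetical stand-in, NOT CLUC: label each record by its first attribute."""
    return [record[0] for record in records]


records = [("red", "small"), ("red", "large"), ("blue", "small"), ("blue", "large")]
print(is_order_insensitive(toy_cluster, records))
```

Comparing partitions as sets of frozensets avoids false mismatches caused by cluster labels being renumbered between runs. A passing check over a finite number of shuffles is evidence of order insensitivity, not a proof of it.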