Clustering categorical data poses two challenges: defining an inherently meaningful similarity measure, and effectively dealing with clusters that are often embedded in different subspaces. In this paper, we propose a novel divisive hierarchical clustering algorithm for categorical data, named DHCC. We view the task of clustering categorical data from an optimization perspective, and propose effective procedures to initialize and refine the splitting of clusters. The initialization of the splitting is based on multiple correspondence analysis (MCA). We also devise a strategy for deciding when to terminate the splitting process. The proposed algorithm has five merits. First, due to its hierarchical nature, it yields a dendrogram representing nested groupings of patterns and similarity levels at different granularities. Second, it is parameter-free and fully automatic; in particular, it requires no assumption regarding the number of clusters. Third, it is independent of the order in which the data is processed. Fourth, it is scalable to large data sets. Finally, it is capable of seamlessly discovering clusters embedded in subspaces, thanks to its use of a novel data representation and Chi-square dissimilarity measures. Experiments on both synthetic and real data demonstrate the superior performance of our algorithm.
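To make the MCA-based initialization and the Chi-square dissimilarity more concrete, the following is a minimal Python sketch, not the DHCC implementation. It assumes that objects are encoded as a 0/1 indicator matrix with one column per attribute value, that a cluster is bisected by the sign of each object's coordinate on the first MCA axis, and that the dissimilarity between an object and a cluster center is a Chi-square distance weighted by the inverse column masses. The function names (`indicator_matrix`, `mca_bisect`, `chi_square_distance`) are hypothetical, and the refinement and termination steps mentioned in the abstract are not shown.

```python
import numpy as np


def indicator_matrix(rows):
    """One-hot encode categorical rows: one column per (attribute index, value) pair."""
    columns = sorted({(j, v) for row in rows for j, v in enumerate(row)})
    index = {col: k for k, col in enumerate(columns)}
    Z = np.zeros((len(rows), len(columns)))
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            Z[i, index[(j, v)]] = 1.0
    return Z, columns


def mca_bisect(Z):
    """Assumed initialization: split objects by the sign of their first MCA coordinate.

    MCA is computed here as correspondence analysis of the indicator matrix,
    via an SVD of the matrix of standardized residuals.
    """
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)
    U, _, _ = np.linalg.svd(S, full_matrices=False)
    first_axis = U[:, 0] / np.sqrt(r)    # row coordinates on the first axis (up to scale)
    return first_axis >= 0               # boolean mask: True = one child, False = the other


def chi_square_distance(z, center, col_mass):
    """Assumed Chi-square distance between an indicator row and a cluster center."""
    return float(np.sum((z - center) ** 2 / col_mass))
```

In a full divisive procedure, a split such as the one returned by `mca_bisect` would only be a starting point: objects would then be reassigned between the two children using a dissimilarity such as `chi_square_distance`, and a stopping rule would decide whether the split is kept.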