Clustering categorical data is an integral part of data mining and has attracted much attention recently. In this paper, we present k-ANMI, a new efficient algorithm for clustering categorical data. The k-ANMI algorithm works in a way similar to the popular k-means algorithm, and the goodness of the clustering at each step is evaluated using a mutual-information-based criterion (namely, average normalized mutual information, ANMI) borrowed from cluster ensembles. The algorithm is easy to implement, requiring multiple hash tables as its only major data structure. Experimental results on real datasets show that the k-ANMI algorithm is competitive with state-of-the-art categorical data clustering algorithms with respect to clustering accuracy.
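To make the ANMI criterion concrete, the following is a minimal sketch of how such a score could be computed for a candidate clustering. It rests on assumptions that the abstract does not state: each categorical attribute is treated as one partition of the objects, and mutual information is normalized by the geometric mean of the two entropies, as is common in the cluster ensemble literature. The function names (`entropy`, `mutual_information`, `nmi`, `anmi`) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log, sqrt


def entropy(labels):
    """Shannon entropy of a label sequence (natural log)."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label sequences of equal length."""
    n = len(a)
    # Counter is a hash table keyed by label (or by label pair).
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    return sum(
        (n_ab / n) * log(n_ab * n / (count_a[x] * count_b[y]))
        for (x, y), n_ab in count_ab.items()
    )


def nmi(a, b):
    """Assumed normalization: I(a; b) / sqrt(H(a) * H(b))."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0 or h_b == 0:
        return 0.0  # a constant labeling carries no information
    return mutual_information(a, b) / sqrt(h_a * h_b)


def anmi(clusters, rows):
    """Average NMI between a clustering and each attribute column."""
    columns = list(zip(*rows))
    return sum(nmi(clusters, col) for col in columns) / len(columns)


# Hypothetical toy data: four objects, two categorical attributes.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
clusters = [0, 0, 1, 1]
score = anmi(clusters, rows)
```

In this toy example the clustering coincides with the first attribute and is independent of the second, so the score lies halfway between the two extremes. A k-means-like search would, under the same assumptions, move objects between clusters and keep the moves that increase this score; the counters above play the role of the hash tables mentioned in the abstract.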