Clustering categorical data is an important and challenging data analysis task. In this paper, we explore the use of kernel K-means to cluster categorical data. We propose a new kernel function based on Hamming distance to embed categorical data in a constructed feature space, where the clustering is conducted. We experimentally evaluate the quality of the solutions produced by kernel K-means on real datasets. The results indicate that kernel K-means with the proposed kernel function can feasibly discover clusters embedded in categorical data.
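
To make the idea concrete, the following is a minimal sketch of kernel K-means on categorical records. The abstract does not state the form of the proposed kernel, so the kernel used here is an assumption chosen for illustration: the number of attributes on which two records agree, i.e. d minus their Hamming distance. This is the inner product of the one-hot encodings and is therefore a valid kernel, but it is not necessarily the kernel proposed in the paper. The function names, the toy data, and the initial assignment are all hypothetical.

```python
import numpy as np

def hamming_match_kernel(X):
    """Gram matrix with K[i, j] = number of attributes where rows i and j agree.

    Assumed illustrative kernel (d minus Hamming distance), not necessarily
    the kernel proposed in the paper.
    """
    X = np.asarray(X)
    return (X[:, None, :] == X[None, :, :]).sum(axis=2).astype(float)

def kernel_kmeans(K, n_clusters, init_labels, max_iter=100):
    """Kernel K-means on a precomputed Gram matrix K, from a given start."""
    n = K.shape[0]
    labels = np.asarray(init_labels).copy()
    for _ in range(max_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            size = members.sum()
            if size == 0:
                continue  # an empty cluster keeps distance inf and attracts no points
            # Squared feature-space distance to the cluster mean m_c:
            # ||phi(x_i) - m_c||^2
            #   = K_ii - (2/|C|) * sum_j K_ij + (1/|C|^2) * sum_{j,l} K_jl
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, members].sum(axis=1) / size
                          + K[np.ix_(members, members)].sum() / size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels

# Hypothetical toy data: six records with three categorical attributes each.
X = [["a", "x", "p"],
     ["a", "x", "q"],
     ["a", "y", "p"],
     ["b", "z", "r"],
     ["b", "z", "s"],
     ["b", "w", "r"]]

K = hamming_match_kernel(X)
labels = kernel_kmeans(K, n_clusters=2, init_labels=[0, 0, 1, 1, 1, 1])
print(labels)
```

The sketch never builds the feature space explicitly: distances to cluster means are computed from Gram matrix entries alone, which is what allows a kernel defined directly on categorical values to drive the clustering. Like ordinary K-means, the result depends on the initial assignment, so in practice several restarts would be compared.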