The ability to cluster high-dimensional categorical data is essential for many machine learning applications, such as bioinformatics. Central clustering of categorical data remains difficult because there is no geometrically interpretable definition of a cluster center for such data. In this paper, we propose a novel kernel-density-based definition of a cluster center, built on a Bayes-type probability estimator. We then propose a new algorithm, called k-centers, for central clustering of categorical data. It incorporates a new feature weighting scheme in which each attribute is automatically assigned a weight measuring its individual contribution to the clusters. Experimental results on real-world data show outstanding performance of the proposed algorithm, especially in recognizing biological patterns in DNA sequences.
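The abstract does not state the estimator or the weighting formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a cluster "center" for one attribute can be represented as a smoothed probability vector over that attribute's categories (observed frequencies mixed with a uniform distribution), and that an object is compared to a center by an attribute-weighted squared distance. The names `smoothed_center`, `weighted_distance`, and `bandwidth` are hypothetical and do not come from the paper.

```python
from collections import Counter


def smoothed_center(column, categories, bandwidth=0.1):
    """Return a probability vector over `categories` for one attribute.

    Assumption (not from the paper): observed frequencies are mixed with a
    uniform distribution; `bandwidth` in [0, 1] controls the smoothing.
    `column` must be non-empty.
    """
    counts = Counter(column)
    n = len(column)
    uniform = 1.0 / len(categories)
    return {
        c: bandwidth * uniform + (1.0 - bandwidth) * counts[c] / n
        for c in categories
    }


def weighted_distance(obj, center, weights):
    """Attribute-weighted squared distance between an object and a center.

    `obj` is a tuple of category values, `center` is a list of probability
    vectors (one per attribute), and `weights` holds one non-negative weight
    per attribute. Each value is treated as a one-hot indicator vector.
    """
    total = 0.0
    for value, probs, weight in zip(obj, center, weights):
        squared = sum(
            ((1.0 if c == value else 0.0) - p) ** 2 for c, p in probs.items()
        )
        total += weight * squared
    return total


# Usage: one DNA position observed in four sequences of a cluster.
center_pos0 = smoothed_center(["A", "A", "C", "G"], "ACGT", bandwidth=0.2)
dist = weighted_distance(("A",), [center_pos0], [1.0])
```

Because the smoothed vector still sums to one, it can be read as a probability distribution over categories, which is what makes such a center interpretable. In this sketch the attribute weights are passed in by hand, whereas the abstract describes them as being assigned automatically.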