Assignment methods are at the heart of many algorithms for unsupervised learning and clustering, most notably the well-known K-means and Expectation-Maximization (EM) algorithms. In this work, we study several different methods of assignment, including the "hard" assignments used by K-means and the "soft" assignments used by EM. While it is known that K-means minimizes the distortion on the data and EM maximizes the likelihood, little is known about the systematic differences in behavior between the two algorithms. Here we shed light on these differences via an information-theoretic analysis. The cornerstone of our results is a simple decomposition of the expected distortion, showing that K-means (and its extension for inferring general parametric densities from unlabeled sample data) must implicitly manage a trade-off between how similar the data assigned to each cluster are and how evenly the data are balanced among the clusters. How well the data are balanced is measured by the entropy of the partition defined by the hard assignments. Beyond letting us predict and verify systematic differences between K-means and EM on specific examples, the decomposition supports a rather general argument that K-means will consistently find densities with less "overlap" than EM. We also study a third natural assignment method, which we call posterior assignment; it is close in spirit to the soft assignments of EM, but leads to a surprisingly different algorithm.
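
To make the three assignment rules concrete, the following is a minimal Python/NumPy sketch, not the authors' code: the two-Gaussian sample, the fixed unit variance, the fixed mixing weights, and all constants are illustrative assumptions. It runs hard (K-means-style), soft (EM-style), and posterior (sampled) assignment on one-dimensional data, then reports the two quantities the decomposition trades off: the within-cluster distortion and the entropy of the hard partition induced by the final means.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two overlapping one-dimensional Gaussian clusters.
x = np.concatenate([rng.normal(-1.0, 1.0, 300), rng.normal(1.0, 1.0, 300)])

def posteriors(x, means, sigma=1.0, weights=(0.5, 0.5)):
    # P(cluster j | x) for a spherical Gaussian mixture with known sigma.
    mu = np.asarray(means)
    log_p = -0.5 * ((x[:, None] - mu[None, :]) / sigma) ** 2 + np.log(weights)
    log_p -= log_p.max(axis=1, keepdims=True)  # subtract max for stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def update(x, means, method):
    p = posteriors(x, means)
    if method == "hard":         # K-means style: each point to its MAP cluster
        w = np.eye(2)[p.argmax(axis=1)]
    elif method == "soft":       # EM style: fractional responsibilities
        w = p
    elif method == "posterior":  # sample one assignment from the posterior
        z = (rng.random(len(x)) > p[:, 0]).astype(int)
        w = np.eye(2)[z]
    else:
        raise ValueError(method)
    # Re-estimate each mean as the (weighted) average of its assigned data.
    return w.T @ x / (w.sum(axis=0) + 1e-12), w

for method in ("hard", "soft", "posterior"):
    means = np.array([-0.1, 0.1])  # deliberately poor, nearly collapsed start
    for _ in range(50):
        means, w = update(x, means, method)
    # For the hard partition induced by the final parameters, measure the two
    # sides of the trade-off: within-cluster distortion and partition entropy.
    z = posteriors(x, means).argmax(axis=1)
    distortion = np.mean((x - means[z]) ** 2)
    freq = np.bincount(z, minlength=2) / len(x)
    entropy = -np.sum(freq[freq > 0] * np.log2(freq[freq > 0]))
    print(f"{method:9s} means={np.sort(means).round(2)} "
          f"distortion={distortion:.3f} partition-entropy={entropy:.3f} bits")

On data like this, and consistent with the abstract's claim, the hard variant typically drives the two fitted means farther apart (less overlap between the resulting densities) than the soft variant; posterior assignment, despite drawing from the same posteriors that EM averages over, keeps resampling assignments and so, in this sketch, fluctuates around a solution rather than converging to a fixed point.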