We compare three basic algorithms for model-based clustering on high-dimensional discrete-variable datasets. All three algorithms use the same underlying model: a naive-Bayes model with a hidden root node, also known as a multinomial-mixture model. In the first part of the paper, we perform an experimental comparison of three batch algorithms that learn the parameters of this model: the Expectation-Maximization (EM) algorithm, a "winner-take-all" version of the EM algorithm reminiscent of the K-means algorithm, and model-based agglomerative clustering. We find that the EM algorithm significantly outperforms the other methods, and we proceed to investigate the effect of various initialization methods on the final solution produced by the EM algorithm. The initializations we consider are (1) parameters sampled from an uninformative prior, (2) random perturbations of the marginal distribution of the data, and (3) the output of agglomerative clustering. Although the three initialization methods are substantially different, they lead to learned models of similar quality.
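To make the model and the two EM variants concrete, here is a minimal sketch of EM for a multinomial-mixture model (a naive-Bayes model with a hidden root node) over discrete data. This is an illustration under stated assumptions, not the paper's implementation: the function name `em_multinomial_mixture`, the `hard` flag (which switches to the winner-take-all, K-means-like update), and the Dirichlet smoothing constant `alpha` are all hypothetical choices introduced here. Initialization follows option (1) above, sampling parameters from a flat Dirichlet prior.

```python
import numpy as np

def em_multinomial_mixture(X, K, n_iter=100, hard=False, alpha=1e-3, seed=0):
    """EM for a naive-Bayes model with a hidden root node
    (multinomial-mixture model) over discrete data.

    X     : (N, D) integer array, X[n, d] in {0, ..., V-1}
    K     : number of mixture components (clusters)
    hard  : if True, use winner-take-all assignments instead of soft ones
    alpha : small Dirichlet smoothing constant (an assumption added here
            to avoid zero probabilities; not specified in the abstract)
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    V = int(X.max()) + 1

    # Initialization (1): parameters sampled from an uninformative
    # (flat Dirichlet) prior.
    pi = rng.dirichlet(np.ones(K))                  # mixing weights, (K,)
    theta = rng.dirichlet(np.ones(V), size=(K, D))  # theta[k, d, v]

    for _ in range(n_iter):
        # E-step: accumulate log p(z = k, x_n) in log space for stability.
        log_r = np.log(pi)[None, :].repeat(N, axis=0)     # (N, K)
        for d in range(D):
            log_r += np.log(theta[:, d, X[:, d]]).T       # (N, K)

        if hard:
            # Winner-take-all variant: each point goes to its best cluster.
            z = log_r.argmax(axis=1)
            r = np.zeros((N, K))
            r[np.arange(N), z] = 1.0
        else:
            # Soft responsibilities p(z = k | x_n).
            log_r -= log_r.max(axis=1, keepdims=True)
            r = np.exp(log_r)
            r /= r.sum(axis=1, keepdims=True)

        # M-step: reestimate parameters from (soft or hard) counts.
        Nk = r.sum(axis=0)                                # cluster sizes
        pi = (Nk + alpha) / (Nk + alpha).sum()
        for d in range(D):
            counts = np.zeros((K, V))
            for v in range(V):
                counts[:, v] = r[X[:, d] == v].sum(axis=0)
            theta[:, d, :] = (counts + alpha) / \
                (counts + alpha).sum(axis=1, keepdims=True)

    return pi, theta, r
```

A call such as `pi, theta, r = em_multinomial_mixture(X, K=5)` returns the mixing weights, the per-cluster feature distributions, and the responsibilities; passing `hard=True` gives the winner-take-all variant, which the paper reports is outperformed by standard (soft) EM.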