Computational Statistics & Data Analysis
A method of complexity control in multinomial mixture modeling of multiple-marker genotype data is studied: the Hardy-Weinberg equilibrium (HWE) is imposed between the genotype values. This restriction is natural and is known to hold at the population level under modest assumptions. The hypothesis under study is that imposing the restriction prevents overfitting and leads to a better model. This is shown to indeed be the case. Experimental results on chromosomes 1 and 17 of the HapMap data demonstrate that the restricted model generalizes better to unseen data and finds clusters that correspond better to the ethnic groups of the HapMap than a model without the HWE restriction.
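To illustrate why the HWE restriction acts as complexity control, the following minimal sketch contrasts the parameter update for one marker within one mixture component, with and without the restriction. It is not the authors' implementation. It assumes a biallelic marker whose genotype is coded as the number of copies (0, 1 or 2) of one allele, and the function names and the toy data are hypothetical. Without the restriction, the three genotype probabilities are estimated freely (two free parameters per marker and component). Under HWE they are all determined by a single allele frequency p, as (1-p)^2, 2p(1-p) and p^2.

```python
def hwe_genotype_probs(p):
    """Genotype probabilities for 0, 1, 2 copies of an allele with
    frequency p, assuming Hardy-Weinberg equilibrium."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]


def m_step_unrestricted(genotypes, weights):
    """Weighted maximum-likelihood estimate of three free genotype
    probabilities (hypothetical helper, two free parameters)."""
    total = sum(weights)
    counts = [0.0, 0.0, 0.0]
    for g, w in zip(genotypes, weights):
        counts[g] += w
    return [c / total for c in counts]


def m_step_hwe(genotypes, weights):
    """Weighted maximum-likelihood estimate under HWE (hypothetical
    helper, one free parameter): the allele frequency is the weighted
    mean allele count divided by two."""
    total = sum(weights)
    p = sum(w * g for g, w in zip(genotypes, weights)) / (2.0 * total)
    return hwe_genotype_probs(p)


# Toy data (made up for illustration): genotypes of six individuals at
# one marker, with equal membership weights for the component.
genotypes = [0, 1, 1, 2, 2, 2]
weights = [1.0] * 6

free = m_step_unrestricted(genotypes, weights)
restricted = m_step_hwe(genotypes, weights)
```

In a full EM algorithm the weights would be the posterior probabilities of component membership computed in the E-step. The restricted update halves the number of free parameters per marker and component, which is the sense in which the restriction limits model complexity.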