Complexity control in a mixture model by the Hardy-Weinberg equilibrium

  • Authors:
  • Ella Bingham; Heikki Mannila

  • Affiliations:
  • Helsinki Institute for Information Technology, University of Helsinki, P.O. Box 68, FIN-00014 Helsinki, Finland; Helsinki Institute for Information Technology, University of Helsinki, P.O. Box 68, FIN-00014 Helsinki, Finland and Helsinki Institute for Information Technology, Helsinki University of Technology ...

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2009

Abstract

A method of complexity control in multinomial mixture modeling of multiple-marker genotype data, imposing the Hardy-Weinberg equilibrium (HWE) between the genotype values, is studied. This restriction is very natural and is known to hold at the population level under modest assumptions. The hypothesis under study is that imposing this restriction will prevent overfitting and lead to a better model. This is shown to indeed be the case. Experimental results on chromosomes 1 and 17 of the HapMap data demonstrate that the restricted model generalizes better to unseen data, and also finds clusters that correspond better to the ethnic groups of the HapMap, when compared with a model without the HWE restriction.
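
To make the restriction concrete, the following is a minimal sketch of how HWE ties the three genotype probabilities at a biallelic marker to a single allele frequency p, giving p^2, 2p(1-p), and (1-p)^2. An unrestricted multinomial has two free parameters per marker and mixture component, whereas the HWE form has one. The function names, the genotype coding (0 = AA, 1 = Aa, 2 = aa), and the array layout are assumptions made for illustration; the abstract does not specify the authors' parameterization or fitting procedure, and this is not their implementation.

```python
import numpy as np

def hwe_genotype_probs(p):
    """Genotype probabilities (AA, Aa, aa) under HWE.

    p is the frequency of allele A; it may be a scalar or an array.
    The last axis of the result has length 3 and sums to 1.
    """
    p = np.asarray(p, dtype=float)
    q = 1.0 - p
    return np.stack([p ** 2, 2.0 * p * q, q ** 2], axis=-1)

def mixture_log_likelihood(genotypes, weights, allele_freqs):
    """Log-likelihood of genotype data under an HWE-restricted mixture.

    genotypes:    (N, M) integer array, coded 0 = AA, 1 = Aa, 2 = aa (assumed coding)
    weights:      (K,) mixing proportions summing to 1
    allele_freqs: (K, M) allele-A frequencies strictly between 0 and 1
    Markers are treated as independent within each component.
    """
    genotypes = np.asarray(genotypes)
    weights = np.asarray(weights, dtype=float)
    n_markers = genotypes.shape[1]

    # (K, M, 3) table of log genotype probabilities per component and marker
    log_probs = np.log(hwe_genotype_probs(allele_freqs))

    # Pick the observed genotype at each marker: result has shape (K, N, M)
    picked = log_probs[:, np.arange(n_markers)[None, :], genotypes]

    # Sum over markers, add log mixing weights: shape (K, N)
    per_component = picked.sum(axis=2) + np.log(weights)[:, None]

    # Log-sum-exp over components, then sum over individuals
    return float(np.logaddexp.reduce(per_component, axis=0).sum())
```

With p = 0.5, for example, `hwe_genotype_probs(0.5)` gives the probabilities 0.25, 0.5, and 0.25 for AA, Aa, and aa. Evaluating `mixture_log_likelihood` on held-out individuals is one way the generalization comparison described above could be carried out.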