Maximum likelihood resolution of multi-block genotypes
RECOMB '04 Proceedings of the eighth annual international conference on Research in computational molecular biology
Bayesian haplotype inference via the Dirichlet process
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Nonparametric combinatorial sequence models
RECOMB '11 Proceedings of the 15th Annual international conference on Research in computational molecular biology
A non-parametric visual-sense model of images--extending the cluster hypothesis beyond text
Multimedia Tools and Applications
Uncovering the haplotypes of single nucleotide polymorphisms and their population demography is essential for many biological and medical applications. Methods for haplotype inference developed thus far, including methods based on coalescence, finite and infinite mixtures, and maximal parsimony, ignore the underlying population structure in the genotype data. As noted by Pritchard (2001), different populations can share a certain portion of their genetic ancestors, as well as have their own genetic components through migration and diversification. In this paper, we address the problem of multi-population haplotype inference. We capture cross-population structure using a nonparametric Bayesian prior known as the hierarchical Dirichlet process (HDP) (Teh et al., 2006), combining this prior with a recently developed Bayesian methodology for haplotype phasing known as DP-Haplotyper (Xing et al., 2004). We also develop an efficient sampling algorithm for the HDP based on a two-level nested Pólya urn scheme. We show that our model outperforms extant algorithms on both simulated and real biological data.
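
To make the two-level nested Pólya urn idea concrete, the following is a minimal sketch of drawing from the prior only. It is not the paper's inference algorithm, which also conditions on genotype data. The function names (`draw_from_urn`, `sample_nested_urn`) and the parameters `alpha` (bottom-level concentration) and `gamma` (top-level concentration) are illustrative assumptions. Each population has its own bottom-level urn of founder haplotype labels; when a bottom-level draw asks for something new, the label comes from a single top-level urn shared by all populations, which is what lets populations share founders.

```python
import random
from collections import Counter


def draw_from_urn(counts, concentration, rng):
    """Polya urn draw: return an existing label with probability
    proportional to its count, or None (meaning "something new") with
    probability proportional to the concentration parameter."""
    total = sum(counts.values())
    u = rng.random() * (total + concentration)
    for label, n in counts.items():
        u -= n
        if u < 0:
            return label
    return None


def sample_nested_urn(sizes, alpha, gamma, seed=0):
    """Assign founder labels to individuals in several populations.

    sizes : number of draws per population (hypothetical input)
    alpha : concentration of each bottom-level (per-population) urn
    gamma : concentration of the shared top-level urn
    """
    rng = random.Random(seed)
    top = Counter()                        # shared across populations
    bottoms = [Counter() for _ in sizes]   # one urn per population
    assignments = []
    next_label = 0
    for j, n in enumerate(sizes):
        labels = []
        for _ in range(n):
            k = draw_from_urn(bottoms[j], alpha, rng)
            if k is None:
                # Bottom urn asked for a new ball: consult the top urn.
                k = draw_from_urn(top, gamma, rng)
                if k is None:
                    # Top urn also asked for a new ball: brand-new founder.
                    k = next_label
                    next_label += 1
                top[k] += 1
            bottoms[j][k] += 1
            labels.append(k)
        assignments.append(labels)
    return assignments, top, bottoms


assignments, top, bottoms = sample_nested_urn([30, 30, 30], alpha=1.0, gamma=2.0)
shared = set(bottoms[0]) & set(bottoms[1]) & set(bottoms[2])
print("founders per population:", [len(b) for b in bottoms])
print("founders present in all three populations:", sorted(shared))
```

Because every new bottom-level label is routed through the top urn, a founder that is already popular at the top level is more likely to reappear in other populations, while each population can still acquire founders of its own.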