Ancestry inference in complex admixtures via variable-length Markov chain linkage models

  • Authors:
  • Sivan Bercovici;Jesse M. Rodriguez;Megan Elmore;Serafim Batzoglou

  • Affiliations:
  • Department of Computer Science, Stanford University;Department of Computer Science, Stanford University, USA and Biomedical Informatics Program, Stanford University;Department of Computer Science, Stanford University;Department of Computer Science, Stanford University

  • Venue:
  • RECOMB'12 Proceedings of the 16th Annual International Conference on Research in Computational Molecular Biology
  • Year:
  • 2012

Abstract

Inferring the ancestral origin of chromosomal segments in admixed individuals is key for genetic applications ranging from analyzing population demographics and history to mapping disease genes. Previous methods addressed ancestry inference using either weak models of linkage disequilibrium or large models that make explicit use of ancestral haplotypes. In this paper we introduce ALLOY, an efficient method that incorporates generalized, but highly expressive, linkage disequilibrium models. ALLOY applies a factorial hidden Markov model to capture the parallel process producing the maternal and paternal admixed haplotypes, and models the background linkage disequilibrium in the ancestral populations via an inhomogeneous variable-length Markov chain. We test ALLOY in a broad set of scenarios, from recent to ancient admixtures, with up to four ancestral populations. We show that ALLOY outperforms the previous state of the art and is robust to uncertainties in model parameters.
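
The two modeling ideas named in the abstract can be illustrated with a small Python sketch. This is not ALLOY's implementation: the function names, the context table, and all probabilities below are hypothetical and chosen only for illustration. The first function shows the core lookup of a variable-length Markov chain, where the probability of the next allele is conditioned on the longest stored suffix of the preceding alleles. In an inhomogeneous chain, a separate context table of this kind would be kept for each marker position. The second function enumerates the joint hidden states of a factorial model in which the maternal and paternal haplotypes each carry their own ancestry label.

```python
from itertools import product


def vlmc_prob(contexts, history, allele):
    """Return P(allele | longest suffix of `history` stored in `contexts`).

    `contexts` maps a tuple of preceding alleles to a dict of allele
    probabilities. It must contain the empty context () as a fallback.
    """
    # Try the full history first, then progressively shorter suffixes.
    for start in range(len(history) + 1):
        suffix = tuple(history[start:])
        if suffix in contexts:
            return contexts[suffix][allele]
    raise KeyError("contexts must contain the empty context ()")


def joint_states(populations):
    """Enumerate (maternal ancestry, paternal ancestry) pairs."""
    return list(product(populations, repeat=2))


# Hypothetical context table for one marker in one ancestral population.
contexts = {
    (): {0: 0.6, 1: 0.4},        # no usable context: marginal frequencies
    (1,): {0: 0.3, 1: 0.7},      # previous allele was 1
    (0, 1): {0: 0.1, 1: 0.9},    # previous two alleles were 0 then 1
}

p_long = vlmc_prob(contexts, [1, 0, 1], 1)   # matches context (0, 1)
p_short = vlmc_prob(contexts, [1, 1], 0)     # falls back to context (1,)
p_empty = vlmc_prob(contexts, [0, 0], 1)     # falls back to context ()
states = joint_states(["A", "B", "C", "D"])  # four ancestral populations
```

With four ancestral populations, the joint state space at each marker has 16 ordered ancestry pairs. The variable context length is what lets such a model spend parameters only where longer histories are informative.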