Efficient Inference of Haplotypes from Genotypes on a Pedigree with Mutations and Missing Alleles (Extented Abstract)

  • Authors:
  • Wei-Bung Wang;Tao Jiang

  • Affiliations:
  • Department of Computer Science, University of California, Riverside, USA CA 92506;Department of Computer Science, University of California, Riverside, USA CA 92506

  • Venue:
  • CPM '09 Proceedings of the 20th Annual Symposium on Combinatorial Pattern Matching
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Driven by the international HapMap project, the haplotype inference problem has become an important topic in the computational biology community. In this paper, we study how to efficiently infer haplotypes from genotypes of related individuals as given by a pedigree. Our assumption is that the input pedigree data may contain de novo mutations and missing alleles but is free of genotyping errors and recombinants, which is usually true for tightly linked markers. We formulate the problem as a combinatorial optimization problem, called the minimum mutation haplotype configuration (MMHC) problem, where we seek haplotypes consistent with the given genotypes that incur no recombinants and require the minimum number of mutations. This extends the well studied zero-recombinant haplotype configuration (ZRHC) problem. Although ZRHC is polynomial-time solvable, MMHC is NP-hard. We construct an integer linear program (ILP) for MMHC using the system of linear equations over the field F (2) that has been developed recently to solve ZRHC. Since the number of constraints in the ILP is large (exponentially large in the general case), we present an incremental approach for solving the ILP where we gradually add the constraints to a standard ILP solver until a feasible haplotype configuration is found. Our preliminary experiments on simulated data demonstrate that the method is very efficient on large pedigrees and can infer haplotypes very accurately as well as recover most of the mutations and missing alleles correctly.