Efficient and accurate haplotype inference by combining parsimony and pedigree information

  • Authors:
  • Ana Graça;Inês Lynce;João Marques-Silva;Arlindo L. Oliveira

  • Affiliations:
  • INESC-ID/IST, Technical University of Lisbon, Portugal;INESC-ID/IST, Technical University of Lisbon, Portugal;CSI/CASL, University College Dublin, Ireland;INESC-ID/IST, Technical University of Lisbon, Portugal

  • Venue:
  • ANB'10 Proceedings of the 4th International Conference on Algebraic and Numeric Biology
  • Year:
  • 2010

Abstract

Existing genotyping technologies have enabled researchers to genotype hundreds of thousands of SNPs efficiently and inexpensively. Methods for imputing non-genotyped SNPs and for inferring haplotype information from genotypes nevertheless remain important, since they have the potential to increase the power of statistical association tests. In many cases, studies are conducted on sets of individuals for which pedigree information is relevant and can be used to increase the power of tests and to decrease the impact of population structure on the results. This paper proposes a new Boolean optimization model for haplotype inference that combines two combinatorial approaches: Minimum Recombinant Haplotyping Configuration (MRHC), which minimizes the number of recombination events within a pedigree, and Haplotype Inference by Pure Parsimony (HIPP), which aims to find a solution with a minimum number of distinct haplotypes within a population. The paper also describes the use of well-known techniques that yield significant performance gains, including symmetry breaking, identification of lower bounds, and the use of an appropriate constraint solver. Experimental results show that the new model, PedRPoly, is competitive in terms of both accuracy and efficiency.
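
To make the pure parsimony objective concrete, the following is a minimal brute-force sketch in Python. It is not the PedRPoly model and uses no Boolean optimization solver or pedigree information. It only enumerates all ways of explaining a handful of genotypes and keeps the one with the fewest distinct haplotypes. The site encoding (0 and 1 for homozygous sites, 2 for heterozygous sites) and the function names `explaining_pairs` and `pure_parsimony` are assumptions made for illustration.

```python
from itertools import product


def explaining_pairs(genotype):
    """Yield each unordered haplotype pair that explains a genotype.

    Assumed encoding: 0 or 1 = homozygous site, 2 = heterozygous site.
    At a homozygous site both haplotypes carry that value; at a
    heterozygous site they carry opposite values.
    """
    het = [i for i, site in enumerate(genotype) if site == 2]
    base = [0 if site == 2 else site for site in genotype]
    # Fixing the first heterozygous site of h1 to 0 avoids generating
    # mirrored pairs (h1, h2) and (h2, h1): a very simple form of
    # symmetry breaking, used here only to shrink the enumeration.
    free = het[1:]
    for bits in product((0, 1), repeat=len(free)):
        h1, h2 = list(base), list(base)
        if het:
            h1[het[0]], h2[het[0]] = 0, 1
        for i, b in zip(free, bits):
            h1[i], h2[i] = b, 1 - b
        yield tuple(h1), tuple(h2)


def pure_parsimony(genotypes):
    """Exhaustively search for an explanation with the fewest distinct haplotypes.

    Exponential in the number of heterozygous sites, so this is only
    suitable for toy inputs.
    """
    best_used, best_choice = None, None
    options = [list(explaining_pairs(g)) for g in genotypes]
    for choice in product(*options):
        used = {h for pair in choice for h in pair}
        if best_used is None or len(used) < len(best_used):
            best_used, best_choice = used, choice
    return best_used, best_choice


if __name__ == "__main__":
    genotypes = [(2, 2), (0, 2), (2, 0)]
    used, choice = pure_parsimony(genotypes)
    print(len(used), sorted(used))
    for g, (h1, h2) in zip(genotypes, choice):
        print(g, "<-", h1, h2)
```

For the three toy genotypes in the usage example, explaining (2, 2) with the pair (0, 1) and (1, 0) lets the other two genotypes reuse those haplotypes, so three distinct haplotypes suffice. The alternative pair (0, 0) and (1, 1) would require four. An MRHC-style objective would additionally score each choice by the number of recombination events implied along parent-child relations, which this sketch does not model.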