Empirical exploration of perfect phylogeny haplotyping and haplotypers

  • Authors: Ren Hua Chung; Dan Gusfield

  • Affiliations: Computer Science Department, University of California, Davis, Davis, CA; Computer Science Department, University of California, Davis, Davis, CA

  • Venue: COCOON'03 Proceedings of the 9th annual international conference on Computing and combinatorics
  • Year: 2003

Abstract

The next high-priority phase of human genomics will involve the development of a full Haplotype Map of the human genome [15]. It will be used in large-scale screens of populations to associate specific haplotypes with specific complex, genetically influenced diseases. A key problem, and perhaps the bottleneck, is to computationally determine haplotype pairs from genotype data. An approach to this problem based on viewing it in the context of perfect phylogeny was introduced in [14], along with an efficient solution. A variation of that method that is slower in the worst case was implemented in [3]. Two simpler methods for the perfect phylogeny approach, also slower in the worst case than the first algorithm, were later developed [1,7]. We have implemented and tested all three of these approaches in order to compare and explain their practical efficiencies. We discuss two other empirical observations: a strong phase transition in the frequency of obtaining a unique solution as a function of the number of individuals in the input; and results of using the method to find non-overlapping intervals where the haplotyping solution is highly reliable, as a function of the level of recombination in the data. Finally, we discuss the biological basis for the size of these tests.
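
To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not any of the algorithms from [14], [3], or [1,7], and it runs in exponential time. It assumes a common encoding that the abstract itself does not specify: each genotype is a tuple over {0, 1, 2}, where 0 and 1 are homozygous sites and 2 is a heterozygous site. It uses the standard four-gamete test to decide whether a set of binary haplotypes fits a perfect phylogeny. The function names (`resolutions`, `passes_four_gamete_test`, `pph_solutions`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def resolutions(genotype):
    """Yield every unordered haplotype pair that explains one genotype."""
    het = [i for i, s in enumerate(genotype) if s == 2]
    if not het:
        # No heterozygous site: both haplotypes equal the genotype.
        yield (tuple(genotype), tuple(genotype))
        return
    # Fix the first heterozygous site to 0 in h1 to skip mirrored pairs.
    for bits in product((0, 1), repeat=len(het) - 1):
        h1, h2 = list(genotype), list(genotype)
        for site, b in zip(het, (0,) + bits):
            h1[site] = b
            h2[site] = 1 - b
        yield (tuple(h1), tuple(h2))


def passes_four_gamete_test(haplotypes):
    """True if no pair of sites shows all of 00, 01, 10, 11."""
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        if len({(h[i], h[j]) for h in haplotypes}) == 4:
            return False
    return True


def pph_solutions(genotypes):
    """Enumerate all resolutions whose haplotypes fit a perfect phylogeny."""
    solutions = []
    for choice in product(*(list(resolutions(g)) for g in genotypes)):
        haps = [h for pair in choice for h in pair]
        if passes_four_gamete_test(haps):
            solutions.append(choice)
    return solutions


if __name__ == "__main__":
    print(len(pph_solutions([(2, 2), (0, 1)])))          # two valid resolutions
    print(len(pph_solutions([(2, 2), (0, 1), (1, 0)])))  # one valid resolution
```

In this toy input, adding a third individual reduces the number of valid solutions from two to one. This is only a small illustration of how the number of individuals can affect uniqueness, not a reproduction of the paper's experiments.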