Haplotype inference via hierarchical genotype parsing

  • Authors:
  • Pasi Rastas; Esko Ukkonen

  • Affiliations:
  • Department of Computer Science and Helsinki Institute for Information Technology, University of Helsinki, Finland; Department of Computer Science and Helsinki Institute for Information Technology, University of Helsinki, Finland

  • Venue:
  • WABI'07 Proceedings of the 7th international conference on Algorithms in Bioinformatics
  • Year:
  • 2007

Abstract

The within-species genetic variation due to recombinations leads to a mosaic-like structure of DNA. This structure can be modeled, e.g., by parsing sample sequences of current DNA with respect to a small number of founders. The founders represent the ancestral sequence material from which the sample was created in a sequence of recombination steps. This scenario has recently been applied successfully to developing probabilistic Hidden Markov Methods for haplotyping genotypic data. In this paper we introduce a combinatorial method for haplotyping that is based on a similar parsing idea. We formulate a polynomial-time parsing algorithm that finds a minimum cross-over parse in a simplified 'flat' parsing model that ignores the historical hierarchy of recombinations. The problem of constructing optimal founders that would give the minimum possible parse for given genotypic sequences is shown to be NP-hard. A heuristic, locally optimal algorithm is given for founder construction. Combined with flat parsing, this already gives quite good haplotyping results. Improved haplotyping is obtained by using a hierarchical parsing that properly models the natural recombination process. For finding short hierarchical parses, a greedy polynomial-time algorithm is given. Empirical haplotyping results on HapMap data are reported.
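
The flat parsing idea can be illustrated with a small dynamic program. The following is a minimal sketch and not the authors' algorithm: it assumes founders are given as equal-length 0/1 tuples, that a genotype is encoded per site as 0 or 1 (homozygous for that allele) or 2 (heterozygous), and that the cost of a parse is the total number of founder switches over the two haplotypes. The names `compatible` and `min_crossover_parse` are hypothetical.

```python
from itertools import product

INF = float("inf")


def compatible(g, x, y):
    """Check whether founder alleles x and y explain genotype value g.

    Assumed encoding: g is 0 or 1 for a homozygous site, 2 for heterozygous.
    """
    if g == 2:
        return x != y
    return x == g and y == g


def min_crossover_parse(founders, genotype):
    """Return the minimum number of cross-overs needed to parse a genotype
    as a pair of haplotypes copied from the founders, or None if no parse
    exists. Naive O(m * k^4) dynamic program over founder pairs."""
    k = len(founders)
    pairs = list(product(range(k), repeat=2))

    # cost[(a, b)]: cheapest parse of the prefix that ends with the first
    # haplotype copying founder a and the second copying founder b.
    cost = {
        (a, b): 0 if compatible(genotype[0], founders[a][0], founders[b][0]) else INF
        for (a, b) in pairs
    }

    for i in range(1, len(genotype)):
        new_cost = {}
        for (a, b) in pairs:
            if not compatible(genotype[i], founders[a][i], founders[b][i]):
                new_cost[(a, b)] = INF
                continue
            # Each haplotype that changes founder contributes one cross-over.
            new_cost[(a, b)] = min(
                cost[(c, d)] + (c != a) + (d != b) for (c, d) in pairs
            )
        cost = new_cost

    best = min(cost.values())
    return None if best == INF else best


founders = [(0, 0, 0, 0), (1, 1, 1, 1)]
print(min_crossover_parse(founders, [2, 2, 1, 1]))
```

In this toy input, one haplotype can copy the second founder throughout while the other switches founders once, so a single cross-over suffices. Tracing back the minimizing pairs would yield the two inferred haplotypes; that step is omitted here for brevity.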