Influence of Tree Topology Restrictions on the Complexity of Haplotyping with Missing Data

  • Authors:
  • Michael Elberfeld;Ilka Schnoor;Till Tantau

  • Affiliations:
  • Institut für Theoretische Informatik, Universität zu Lübeck, Lübeck, Germany 23538;Institut für Theoretische Informatik, Universität zu Lübeck, Lübeck, Germany 23538;Institut für Theoretische Informatik, Universität zu Lübeck, Lübeck, Germany 23538

  • Venue:
  • TAMC '09 Proceedings of the 6th Annual Conference on Theory and Applications of Models of Computation
  • Year:
  • 2009

Abstract

Haplotyping, also known as haplotype phase prediction, is the problem of predicting likely haplotypes from genotype data. One fast haplotyping method is based on an evolutionary model in which a perfect phylogenetic tree is sought that explains the observed data. Unfortunately, when data entries are missing, as is often the case in laboratory data, the resulting incomplete perfect phylogeny haplotyping problem (ipph) is NP-complete, and no theoretical results are known concerning its approximability, its fixed-parameter tractability, or exact algorithms for it. Even radically simplified versions, such as the restriction to phylogenetic trees consisting of just two directed paths from a given root, are still NP-complete, although a fixed-parameter algorithm is known for that restriction. We show that such drastic and ad hoc simplifications are not necessary to make ipph fixed-parameter tractable: we present the first theoretical analysis of an algorithm, which we develop in the course of the paper, that works for arbitrary instances of ipph. On the negative side, we show that restricting the topology of perfect phylogenies does not always reduce the computational complexity: while the incomplete directed perfect phylogeny problem is well known to be solvable in polynomial time, we show that the same problem restricted to path topologies is NP-complete.
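To make the terms in the abstract concrete, the following is a minimal sketch of the two conditions that a haplotyping solution has to satisfy. It is not the algorithm from the paper. The encoding is an assumption commonly used in the literature and not stated in the abstract: haplotypes are strings over `0`/`1`, genotypes use `0` and `1` for homozygous sites, `2` for heterozygous sites, and `?` for a missing entry. The function names `explains` and `admits_perfect_phylogeny` are hypothetical. The second function uses the standard four-gamete test for undirected perfect phylogenies on complete binary data.

```python
from itertools import combinations

MISSING = "?"  # assumed marker for a missing genotype entry


def explains(h1, h2, genotype):
    """Return True if the haplotype pair (h1, h2) explains the genotype.

    Assumed encoding: '0'/'1' = homozygous, '2' = heterozygous,
    '?' = missing (any pair of values is allowed at that site).
    """
    for a, b, g in zip(h1, h2, genotype):
        if g == MISSING:
            continue
        if g == "2":
            if a == b:  # heterozygous sites need two different alleles
                return False
        elif not (a == b == g):  # homozygous sites need both alleles equal to g
            return False
    return True


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on complete binary haplotypes.

    A set of binary haplotypes admits an (undirected) perfect phylogeny
    if and only if no pair of sites shows all four combinations
    00, 01, 10, 11.
    """
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True
```

With missing entries, the difficulty is that each `?` must be filled in so that both conditions hold at once for all genotypes; this sketch only checks a candidate solution and does not search for one.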