On the Multiple Gene Duplication Problem

  • Authors:
  • Michael R. Fellows;Michael T. Hallett;Ulrike Stege

  • Venue:
  • ISAAC '98 Proceedings of the 9th International Symposium on Algorithms and Computation
  • Year:
  • 1998

Abstract

A fundamental problem in computational biology is the determination of the correct species tree for a set of taxa given a set of (possibly contradictory) gene trees. In recent literature, the DUPLICATION/LOSS model has received considerable attention. In that model, one measures the similarity/dissimilarity between a set of gene trees by counting the number of paralogous gene duplications and subsequent gene losses which need to be postulated in order to explain (in an evolutionarily meaningful way) how the gene trees could have arisen with respect to the species tree. In this paper, we instead count the number of multiple gene duplication events (duplication events in the genome of the organism involving one or more genes) without regard to gene losses. MULTIPLE GENE DUPLICATION asks for the species tree S which requires the fewest multiple gene duplication events to be postulated in order to explain a set of gene trees G1, G2, ..., Gk. We also examine the related problem which assumes the species tree S is known and asks for the explanation of G1, G2, ..., Gk requiring the fewest multiple gene duplications. Via a reduction to and from a combinatorial model we call the BALL AND TRAP GAME, we show that the general form of this problem is NP-hard and that various parameterized versions are hard for the complexity class W[1]. These results immediately imply that MULTIPLE GENE DUPLICATION is similarly hard. We prove that several parameterized variants are in FPT.
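
To make the notion of a postulated duplication concrete, the sketch below counts individual gene duplications for one gene tree against a known species tree, using the standard least-common-ancestor (LCA) mapping from the gene tree reconciliation literature. It is an illustration only, not the method of the paper: it counts each duplication separately and does not merge duplications into multiple gene duplication events, which is the harder problem studied here. The nested-tuple tree encoding and the names `clusters` and `count_duplications` are assumptions made for this example.

```python
def clusters(tree):
    """Return the leaf set of every node in a nested-tuple tree.

    Assumed encoding: a leaf is a species name (str), an internal
    node is a tuple of subtrees.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that the LCA mapping marks as duplications.

    Gene-tree leaves are labeled with the species they were sampled from.
    Every species in the gene tree must occur in the species tree.
    """
    species_clusters = clusters(species_tree)

    def lca(species):
        # Clusters of one tree that contain a given set are nested,
        # so the smallest one is the least common ancestor.
        return min((c for c in species_clusters if species <= c), key=len)

    duplications = 0

    def walk(node):
        nonlocal duplications
        if isinstance(node, str):
            leaves = frozenset([node])
            return leaves, lca(leaves)
        results = [walk(child) for child in node]
        leaves = frozenset().union(*(r[0] for r in results))
        image = lca(leaves)
        # A node is a duplication if it maps to the same species-tree
        # node as one of its children.
        if any(child_image == image for _, child_image in results):
            duplications += 1
        return leaves, image

    walk(gene_tree)
    return duplications


species = (("a", "b"), "c")
print(count_duplications((("a", "b"), "c"), species))
print(count_duplications((("a", "c"), "b"), species))
```

In this example the first gene tree agrees with the species tree and needs no duplication, while the second groups a with c and therefore forces one duplication at its root. Summing the count over G1, G2, ..., Gk gives the plain duplication cost of a candidate species tree under these assumptions.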