New algorithms for the duplication-loss model

  • Authors:
  • M. T. Hallett; J. Lagergren

  • Affiliations:
  • Computational Biochemistry Research Group, Dept. of Computer Science, ETH Zürich, Zürich, Switzerland; Dept. of Numerical Analysis of Computing Science, KTH, Stockholm, Sweden

  • Venue:
  • RECOMB '00 Proceedings of the fourth annual international conference on Computational molecular biology
  • Year:
  • 2000

Abstract

We consider the problem of constructing a species tree given a number of gene trees. In the frameworks introduced by Goodman et al. [3], Page [10], and Guigó, Muchnik, and Smith [5], this is formulated as an optimization problem: find the species tree requiring the minimum number of duplications and/or losses to explain the gene trees.

In this paper, we introduce the WIDTH k DUPLICATION-LOSS and WIDTH k DUPLICATION problems. A gene tree has width k w.r.t. a species tree if the species tree can be reconciled with the gene tree using at most k simultaneously active copies of the gene along its branches. We explain, w.r.t. the underlying biological model, why this width is typically very small in comparison to the total number of duplications and losses. We give polynomial-time algorithms for finding optimal species trees having bounded width w.r.t. at least one of the input gene trees. Furthermore, we present the first algorithm for input gene trees that are unrooted. Lastly, we apply our algorithms to a dataset from [5] and show a species tree requiring significantly fewer duplications and fewer duplications/losses than the trees given in the original paper.
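
As an illustration of the cost being minimized, the following is a minimal sketch of counting duplications for one rooted gene tree against one fixed species tree. It uses the least-common-ancestor mapping commonly associated with this framework. It is not the paper's bounded-width algorithm, and it does not search over species trees or count losses. The nested-tuple tree encoding and the names clusters, lca_map, and count_duplications are assumptions made for this example only. Gene tree leaves are assumed to be labeled with the species they were sampled from.

```python
def clusters(tree):
    """Return the leaf set (cluster) of every node in a nested-tuple tree."""
    found = []

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.append(leaves)
        return leaves

    walk(tree)
    return found


def lca_map(species_clusters, gene_leaves):
    """Map a gene node to the smallest species cluster containing its leaves."""
    # Raises ValueError if a gene leaf names a species absent from the species tree.
    return min((c for c in species_clusters if gene_leaves <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene tree nodes mapped to the same species node as one of their children."""
    species_clusters = clusters(species_tree)
    duplications = 0

    def walk(node):
        nonlocal duplications
        if isinstance(node, str):
            leaves = frozenset([node])
            return leaves, lca_map(species_clusters, leaves)
        children = [walk(child) for child in node]
        leaves = frozenset().union(*(child_leaves for child_leaves, _ in children))
        image = lca_map(species_clusters, leaves)
        # Under the LCA mapping, a node sharing its image with a child is a duplication.
        if any(child_image == image for _, child_image in children):
            duplications += 1
        return leaves, image

    walk(gene_tree)
    return duplications


# Hypothetical toy input: three species a, b, c.
species = (("a", "b"), "c")
print(count_duplications((("a", "b"), "c"), species))  # 0: gene tree agrees with species tree
print(count_duplications((("a", "c"), "b"), species))  # 1: the root must be a duplication
```

In this sketch, a species tree that explains a set of gene trees well is one for which the sum of count_duplications over all gene trees is small. The optimization problem in the abstract asks for the species tree minimizing that kind of total.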