An exact and polynomial distance-based algorithm to reconstruct single copy tandem duplication trees

Authors:
Olivier Elemento; Olivier Gascuel
Affiliations:
Département d'Informatique Fondamentale et Applications, LIRMM, Montpellier, France and Int. ImMunoGeneTics database, Lab. d'Immunogénétique Moléculaire, LIGM, Université ...; Département d'Informatique Fondamentale et Applications, LIRMM, Montpellier, France
Venue:
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Year:
2003

Citing 6

Computational recreations in Mathematica
Computational recreations in Mathematica

Improved approximation algorithms for tree alignment
Journal of Algorithms

Zinc finger gene clusters and tandem gene duplication
RECOMB '01 Proceedings of the fifth annual international conference on Computational biology

Methods for reconstructing the history of tandem repeats and their application to the human genome
Journal of Computer and System Sciences - Computational biology 2002

Reconstructing the Duplication History of a Tandem Repeat
Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology

On the consistency of the minimum evolution principle of phylogenetic inference
Discrete Applied Mathematics - Special issue: Computational molecular biology series issue IV

Cited 1

Topological Rearrangements and Local Search Method for Tandem Duplication Trees
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Quantified Score

Hi-index	0.00

Abstract

The problem of reconstructing the duplication tree of a set of tandemly repeated sequences, which are assumed to have arisen by unequal recombination, was first introduced by Fitch (1977) and has recently received a lot of attention. In this paper, we deal with the restricted problem of reconstructing single copy duplication trees. We describe an exact and polynomial distance-based algorithm for solving this problem, the parsimony version of which has previously been shown to be NP-hard (like most evolutionary tree reconstruction problems). This algorithm is based on the minimum evolution principle, and thus selects the shortest tree as the correct duplication tree. After presenting the mathematical concepts underlying the minimum evolution principle and some of its benefits (such as consistency), we provide a new recurrence equation to estimate the tree length using ordinary least squares, given a matrix of pairwise distances between the copies. We then show how this equation naturally forms the dynamic programming framework on which our algorithm is based, and provide an implementation in O(n^3) time and O(n^2) space, where n is the number of copies.
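To make the minimum evolution idea concrete, the sketch below estimates the ordinary least-squares length of each of the three unrooted four-leaf topologies from a distance matrix and keeps the shortest. It is an illustration only: it is not the paper's recurrence or its dynamic program, it enumerates topologies by brute force, and it ignores the constraints specific to duplication trees. All function names, the leaf labels, and the example distances are assumptions introduced here.

```python
from fractions import Fraction
from itertools import combinations


def quartet_edges(split):
    """Edges of the unrooted 4-leaf tree for a split, each given as the
    set of leaves on one side of the edge (4 pendant edges + 1 internal)."""
    (a, b), (c, d) = split
    return [frozenset([leaf]) for leaf in (a, b, c, d)] + [frozenset([a, b])]


def path_matrix(edges, leaves):
    """Rows are leaf pairs, columns are edges; an entry is 1 when the
    edge lies on the path between the two leaves, else 0."""
    pairs = list(combinations(leaves, 2))
    rows = [[Fraction(int((i in e) != (j in e))) for e in edges]
            for i, j in pairs]
    return pairs, rows


def solve(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def ols_tree_length(split, dist):
    """Sum of the least-squares edge lengths for the given topology.

    `dist` maps frozenset({x, y}) to the distance between leaves x and y.
    Edge lengths solve the normal equations (A^T A) x = A^T d.
    """
    leaves = [leaf for side in split for leaf in side]
    edges = quartet_edges(split)
    pairs, a = path_matrix(edges, leaves)
    d = [Fraction(dist[frozenset(p)]) for p in pairs]
    k = len(edges)
    ata = [[sum(row[i] * row[j] for row in a) for j in range(k)]
           for i in range(k)]
    atd = [sum(row[i] * di for row, di in zip(a, d)) for i in range(k)]
    return sum(solve(ata, atd))


def minimum_evolution_quartet(dist, leaves=("a", "b", "c", "d")):
    """Return the split whose OLS tree length is smallest."""
    a, b, c, d = leaves
    splits = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    return min(splits, key=lambda s: ols_tree_length(s, dist))


# Hypothetical distances generated by the tree ab|cd with all edges of length 1.
example = {
    frozenset("ab"): 2, frozenset("cd"): 2,
    frozenset("ac"): 3, frozenset("ad"): 3,
    frozenset("bc"): 3, frozenset("bd"): 3,
}
best = minimum_evolution_quartet(example)
print(best, ols_tree_length(best, example))
```

On these example distances the generating topology ab|cd has estimated length 5, while each of the two alternatives has length 11/2, so the shortest tree is the correct one. Exact rational arithmetic is used only to keep the comparison free of rounding; a floating-point least-squares solver would serve equally well for real data.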