Fast and reliable reconstruction of phylogenetic trees with indistinguishable edges

Authors:
Ilan Gronau;Shlomo Moran;Sagi Snir
Affiliations:
Department of Computer Science, Technion - Israel Institute of Technology, Haifa, 32000 Israel;Department of Computer Science, Technion - Israel Institute of Technology, Haifa, 32000 Israel;Department of Evolutionary and Environmental Biology and The Institute of Evolution, University of Haifa Mount Carmel, Haifa 31905 Israel
Venue:
Random Structures & Algorithms
Year:
2012

References:

A fast algorithm for constructing trees from distance matrices. Information Processing Letters
Absolute convergence: true trees from short sequences. SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
Introduction to algorithms
Evolutionary Trees Can be Learned in Polynomial Time in the Two-State General Markov Model. SIAM Journal on Computing
Inverting Random Functions II: Explicit Bounds for Discrete Maximum Likelihood Estimation, with Applications. SIAM Journal on Discrete Mathematics
On the complexity of distance-based evolutionary tree reconstruction. SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Optimal phylogenetic reconstruction. Proceedings of the thirty-eighth annual ACM symposium on Theory of computing
Distorted Metrics on Trees and Phylogenetic Forests. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Fast and reliable reconstruction of phylogenetic trees with very short edges. Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Sequence Length Requirement of Distance-Based Phylogeny Reconstruction: Breaking the Polynomial Barrier. FOCS '08 Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science
Phylogenies without Branch Bounds: Contracting the Short, Pruning the Deep. RECOMB '09 Proceedings of the 13th Annual International Conference on Research in Computational Molecular Biology
Maximal accurate forests from distance matrices. RECOMB '06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology

Abstract

Phylogenetic reconstruction methods attempt to reconstruct a tree describing the evolution of a given set of species, using sequences of characters (e.g., DNA) extracted from these species as input. A central goal in this area is to design algorithms that guarantee reliable reconstruction of the tree from short input sequences, assuming common stochastic models of evolution. The fast converging reconstruction algorithms introduced in the last decade dramatically reduced the sequence length required to guarantee accurate reconstruction of the entire tree. However, if the tree in question contains even a few edges that cannot be reliably reconstructed from the input sequences, then known fast converging algorithms may fail to reliably reconstruct all or most of the other edges. This calls for the approach suggested in this paper, called adaptive fast convergence, in which the set of edges that can be reliably reconstructed gradually increases with the amount of information (length of input sequences) available to the algorithm. This paper presents an adaptive fast converging algorithm which returns a partially resolved topology containing no false edges: edges that cannot be reliably reconstructed are contracted into high-degree vertices. We also present an upper bound on the weights of those contracted edges, which is determined by the length of the input sequences and the depth of the tree. As such, the reconstruction guarantee provided by our algorithm for individual edges is significantly stronger than any previously published edge reconstruction guarantee. This fact, together with the optimal complexity of our algorithm (linear space and quadratic time), makes it appealing for practical use. © 2011 Wiley Periodicals, Inc. Random Struct. Alg., 40, 350–384, 2011 © 2012 Wiley Periodicals, Inc. (A preliminary version of this paper appears in (15).)
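
To make the notion of a partially resolved topology concrete, the following Python sketch contracts every edge of a weighted tree whose weight falls below a cutoff, merging its endpoints into a single higher-degree vertex. This is only an illustration of what the output looks like, not the algorithm of the paper: the paper's method works from sequences and never sees the true edge weights, and the function names, the example tree, and the fixed `threshold` value are all assumptions made here for illustration (in the paper the bound on contracted edge weights depends on sequence length and tree depth).

```python
from collections import Counter


def contract_short_edges(edges, threshold):
    """Contract every edge lighter than `threshold` (a hypothetical cutoff).

    edges: list of (u, v, weight) tuples describing a weighted tree.
    Returns the surviving edges, with each endpoint renamed to the
    representative of the merged vertex it now belongs to.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the two endpoints of every edge considered unreliable.
    for u, v, w in edges:
        ru, rv = find(u), find(v)
        if w < threshold:
            parent[ru] = rv

    # Keep only the reliable edges, expressed over merged vertices.
    return [(find(u), find(v), w) for u, v, w in edges if w >= threshold]


def degrees(edges):
    """Count how many edges touch each vertex."""
    count = Counter()
    for u, v, _ in edges:
        count[u] += 1
        count[v] += 1
    return count


# Hypothetical tree: leaves a..e, internal vertices x, y, z.
# The internal edge (x, y) is very short and gets contracted.
tree = [
    ("a", "x", 0.30), ("b", "x", 0.40), ("x", "y", 0.01),
    ("y", "c", 0.50), ("y", "z", 0.20), ("z", "d", 0.30), ("z", "e", 0.25),
]
resolved = contract_short_edges(tree, threshold=0.05)
print(resolved)
print(degrees(resolved)["y"])
```

In this example the vertices x and y collapse into one vertex of degree 4, while all other edges are left untouched, so the result contains no edge that was absent from the original tree.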