Towards a practical O(n log n) phylogeny algorithm

Authors:
Daniel G. Brown; Jakub Truszkowski
Affiliations:
David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada (both authors)
Venue:
WABI'11 Proceedings of the 11th international conference on Algorithms in bioinformatics
Year:
2011

Abstract

Recently, we have identified a quartet phylogeny algorithm with O(n log n) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has a high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O(log log n) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error-prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while slowing the algorithm down by only a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n^3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost.
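For readers unfamiliar with quartet queries, the following is a minimal sketch of how a single quartet topology might be inferred from pairwise distances using the standard four-point method. The abstract does not say how quartets are computed, so this is an illustrative assumption rather than the authors' procedure; the function name quartet_topology and the distance-matrix input are hypothetical.

```python
def quartet_topology(dist, a, b, c, d):
    """Guess the split of taxa {a, b, c, d} with the four-point method.

    Assumption (not from the abstract): `dist` is a symmetric matrix of
    pairwise evolutionary distances, indexable as dist[i][j].
    On an additive tree, the true split has the smallest sum of
    within-pair distances; on noisy data the answer can be wrong,
    which is the kind of quartet error the abstract refers to.
    """
    sums = {
        ((a, b), (c, d)): dist[a][b] + dist[c][d],
        ((a, c), (b, d)): dist[a][c] + dist[b][d],
        ((a, d), (b, c)): dist[a][d] + dist[b][c],
    }
    # Return the pairing whose within-pair distance sum is minimal.
    return min(sums, key=sums.get)


# Additive distances for the tree ((0,1),(2,3)) with all edge lengths 1.
example_dist = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]
print(quartet_topology(example_dist, 0, 1, 2, 3))  # prints ((0, 1), (2, 3))
```

Each such query takes constant time, so the overall running time of a quartet-based method is governed by how many queries it issues.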