A Metric for Phylogenetic Trees Based on Matching

Authors: Yu Lin; Vaibhav Rajan; Bernard M. E. Moret
Affiliations: EPFL, Lausanne; EPFL, Lausanne; EPFL, Lausanne
Venue: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year: 2012

Abstract

Comparing two or more phylogenetic trees is a fundamental task in computational biology. The simplest outcome of such a comparison is a pairwise measure of similarity, dissimilarity, or distance. A large number of such measures have been proposed, but so far all suffer from problems ranging from computational cost to lack of robustness; many can be shown to behave unexpectedly under certain plausible inputs. For instance, the widely used Robinson-Foulds distance is poorly distributed and thus affords little discrimination, and it also lacks robustness in the face of very small changes: reattaching a single leaf elsewhere in a tree of any size can instantly maximize the distance. In this paper, we introduce a new pairwise distance measure, based on matching, for phylogenetic trees. We prove that our measure induces a metric on the space of trees, show how to compute it in low polynomial time, verify through statistical testing that it is robust, and finally note that it does not exhibit unexpected behavior under the same inputs that cause problems with other measures. We also illustrate its usefulness in clustering trees, demonstrating significant improvements in the quality of hierarchical clustering compared with clustering the same collections of trees under the Robinson-Foulds distance.
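
The Robinson-Foulds behavior described in the abstract can be reproduced on a small example. The sketch below is illustrative only and is not the authors' implementation. It assumes trees are written as nested Python tuples over integer leaf labels and read as unrooted, and every function name in it is hypothetical. The abstract does not define the matching-based measure, so matching_distance here is one plausible reading, assumed for illustration: pair up the splits of two binary trees one-to-one so that the summed mismatch between paired splits is minimal, where the mismatch of two splits is the number of leaves that would have to switch sides. The brute-force search over permutations is only suitable for very small trees.

```python
from itertools import permutations


def leaf_set(tree):
    """Return the set of leaf labels of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Non-trivial splits of the tree, read as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so equal splits always compare equal.
    """
    leaves = leaf_set(tree)
    ref = min(leaves)
    found = set()

    def visit(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = leaves - below if ref in below else below
        if 2 <= len(side) <= len(leaves) - 2:
            found.add(side)
        return below

    visit(tree)
    return found


def rf_distance(t1, t2):
    """Robinson-Foulds distance: number of splits found in only one tree."""
    return len(splits(t1) ^ splits(t2))


def matching_distance(t1, t2):
    """Hypothetical matching-based distance (an assumption, see lead-in).

    Cost of pairing two splits = fewest leaves that must switch sides.
    Brute force over all one-to-one pairings; small trees only.
    """
    if leaf_set(t1) != leaf_set(t2):
        raise ValueError("trees must have the same leaf set")
    n = len(leaf_set(t1))
    s1, s2 = list(splits(t1)), list(splits(t2))
    if len(s1) != len(s2):
        raise ValueError("sketch assumes equal split counts (binary trees)")

    def cost(a, b):
        h = len(a ^ b)
        return min(h, n - h)

    return min(
        sum(cost(a, b) for a, b in zip(s1, perm))
        for perm in permutations(s2)
    )


def caterpillar(order):
    """Build a caterpillar (ladder) tree with leaves in the given order."""
    tree = (order[0], order[1])
    for leaf in order[2:]:
        tree = (tree, leaf)
    return tree


original = caterpillar([1, 2, 3, 4, 5, 6, 7, 8])
moved = caterpillar([2, 3, 4, 5, 6, 7, 8, 1])  # leaf 1 reattached at the far end

print("RF distance:", rf_distance(original, moved))
print("RF maximum for 8 leaves:", 2 * (8 - 3))
print("Matching-based sketch:", matching_distance(original, moved))
```

On these two eight-leaf caterpillars, moving the single leaf 1 leaves the trees with no split in common, so the Robinson-Foulds distance equals its maximum of 2(n - 3) = 10, whereas the matching-based sketch returns 6 because most splits can be paired with a near-identical counterpart. For trees of realistic size, the permutation search would need to be replaced by a polynomial-time assignment solver.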