A Linear Time Approximation Scheme for Maximum Quartet Consistency on Sparse Sampled Inputs

Authors:
Sagi Snir; Raphael Yuster
Affiliations:
ssagi@math.haifa.ac.il; raphy@math.haifa.ac.il
Venue:
SIAM Journal on Discrete Mathematics
Year:
2011

Abstract

Phylogenetic tree reconstruction is a fundamental biological problem. Quartet amalgamation, that is, combining a set of trees over four taxa into a tree over the full set, stands at the heart of many phylogenetic reconstruction methods. This task has attracted considerable theoretical and practical work. However, even reconstruction from a consistent set of quartet trees, i.e., a set in which all quartets agree with some tree, is NP-hard, and the best approximation ratio known is $1/3$. For a dense input of $\Theta(n^4)$ quartets that are not necessarily consistent, the problem has a polynomial time approximation scheme. When the number of taxa grows, considering such dense inputs is impractical and some sampling approach is imperative. It is known that given a randomly sampled consistent set of quartets from an unknown phylogeny, one can find, in polynomial time and with high probability, a tree satisfying a $0.425$ fraction of them, an improvement over the $1/3$ ratio. In this paper we further show that given a randomly sampled consistent set of quartets from an unknown phylogeny, where the size of the sample is at least $\Theta(n^2 \log n)$, there is a randomized approximation scheme that runs in time linear in the number of quartets. The previously known polynomial approximation scheme for that problem required a very dense sample of size $\Theta(n^4)$. We note that samples of size $\Theta(n^2 \log n)$ are sparse in the full quartet set. The result is obtained by a combinatorial technique that may be of independent interest.
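
To make the objective in the abstract concrete, the following minimal Python sketch shows what it means for a tree to satisfy a quartet and how the satisfied fraction can be computed. It is an illustration only, not the algorithm of the paper. The representation of a tree as a list of splits (one side of each internal edge), the function names `satisfies`, `fraction_satisfied` and `sample_density`, and the five-taxon example are all assumptions introduced here. The last helper ignores the constants hidden in the $\Theta$ notation and only illustrates why a sample of order $n^2 \log n$ is sparse relative to the $\binom{n}{4}$ possible quartets.

```python
import math


def satisfies(splits, quartet):
    """Return True if some split of the tree separates {a, b} from {c, d}.

    `splits` holds one side of each internal edge of the tree (hypothetical
    representation); `quartet` is a pair of pairs ((a, b), (c, d)), read ab|cd.
    """
    (a, b), (c, d) = quartet
    for side in splits:
        ab_inside = {a, b} <= side and c not in side and d not in side
        cd_inside = {c, d} <= side and a not in side and b not in side
        if ab_inside or cd_inside:
            return True
    return False


def fraction_satisfied(splits, quartets):
    """Fraction of the given quartets that the tree satisfies."""
    return sum(satisfies(splits, q) for q in quartets) / len(quartets)


def sample_density(n):
    """Ratio n^2 log n / C(n, 4), with Theta-constants ignored (assumption)."""
    return n * n * math.log(n) / math.comb(n, 4)


# Example tree on taxa a..e with internal splits ab|cde and abc|de.
splits = [frozenset("ab"), frozenset("abc")]
quartets = [
    (("a", "b"), ("c", "d")),  # satisfied via split ab|cde
    (("a", "c"), ("d", "e")),  # satisfied via split abc|de
    (("a", "d"), ("b", "e")),  # conflicts with the tree
]
print(fraction_satisfied(splits, quartets))
print(sample_density(10), sample_density(100))
```

In this toy example the tree satisfies two of the three quartets, and the density ratio shrinks as $n$ grows, in line with the remark that such samples are sparse in the full quartet set.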