Finding a maximum likelihood tree is hard

Authors:
Benny Chor;Tamir Tuller
Affiliations:
Tel-Aviv University, Tel-Aviv, Israel;Tel-Aviv University, Tel-Aviv, Israel
Venue:
Journal of the ACM (JACM)
Year:
2006

Abstract

Maximum likelihood (ML) is an increasingly popular optimality criterion for selecting evolutionary trees [Felsenstein 1981]. Finding optimal ML trees appears to be a very hard computational task, but for tractable cases, ML is the method of choice. In particular, algorithms and heuristics for ML take longer to run than algorithms and heuristics for the second major character based criterion, maximum parsimony (MP). However, while MP has been known to be NP-complete for over 20 years [Foulds and Graham 1982; Day et al. 1986], such a hardness result for ML has so far eluded researchers in the field.

An important work by Tuffley and Steel [1997] proves quantitative relations between the parsimony values of given sequences and the corresponding log likelihood values. However, a direct application of their work would only give an exponential time reduction from MP to ML. Another step in this direction has recently been made by Addario-Berry et al. [2004], who proved that ancestral maximum likelihood (AML) is NP-complete. AML “lies in between” the two problems, having some properties of MP and some properties of ML. Still, the AML proof is not directly applicable to the ML problem.

We resolve the question, showing that “regular” ML on phylogenetic trees is indeed intractable. Our reduction follows the vertex cover reductions for MP [Day et al. 1986] and AML [Addario-Berry et al. 2004], but its starting point is an approximation version of vertex cover, known as gap vc. The crux of our work is not the reduction, but its correctness proof. The proof goes through a series of tree modifications, while controlling the likelihood losses at each step, using the bounds of Tuffley and Steel [1997]. The proof can be viewed as correlating the value of any ML solution to an arbitrarily close approximation to vertex cover.
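
The abstract does not spell out how a vertex cover instance is turned into sequence data. As an illustration only, the sketch below uses the encoding commonly associated with vertex-cover-style reductions for parsimony: one all-zero string plus one binary string per edge, with 1s at the positions of the edge's two endpoints. This encoding, and the names `edge_strings` and `hamming`, are assumptions made for the example and are not taken from the text above.

```python
from typing import List, Tuple


def edge_strings(num_vertices: int, edges: List[Tuple[int, int]]) -> List[str]:
    """Encode a graph as binary strings of length num_vertices.

    Assumed encoding (hypothetical, for illustration): the first string is
    all zeros; each edge (u, v) contributes a string with 1s at u and v.
    """
    strings = ["0" * num_vertices]
    for u, v in edges:
        bits = ["0"] * num_vertices
        bits[u] = "1"
        bits[v] = "1"
        strings.append("".join(bits))
    return strings


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # A path on three vertices: 0 - 1 - 2.
    for s in edge_strings(3, [(0, 1), (1, 2)]):
        print(s)
```

Under this assumed encoding, two edge strings are at Hamming distance 2 when the edges share an endpoint and 4 otherwise, which is the kind of structure that lets tree cost reflect the size of a vertex cover.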