Maximum likelihood of evolutionary trees: hardness and approximation

  • Authors:
  • Benny Chor;Tamir Tuller

  • Affiliations:
  • School of Computer Science, Tel-Aviv University Tel-Aviv, Israel;School of Computer Science, Tel-Aviv University Tel-Aviv, Israel

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Quantified Score

Hi-index 3.84

Visualization

Abstract

Motivation: Maximum likelihood (ML) is an increasingly popular optimality criterion for selecting evolutionary trees. Yet the computational complexity of ML was open for over 20 years, and only recently resolved by the authors for the Jukes--Cantor model of substitution and its generalizations. It was proved that reconstructing the ML tree is computationally intractable (NP-hard). In this work we explore three directions, which extend that result. Results: (1) We show that ML under the assumption of molecular clock is still computationally intractable (NP-hard). (2) We show that not only is it computationally intractable to find the exact ML tree, even approximating the logarithm of the ML for any multiplicative factor smaller than 1.00175 is computationally intractable. (3) We develop an algorithm for approximating log-likelihood under the condition that the input sequences are sparse. It employs any approximation algorithm for parsimony, and asymptotically achieves the same approximation ratio. We note that ML reconstruction for sparse inputs is still hard under this condition, and furthermore many real datasets satisfy it. Contact: tamirtul@post.tau.ac.il