Phylogenetic models of rate heterogeneity: a high performance computing perspective

  • Authors: Alexandros Stamatakis
  • Affiliations: Institute of Computer Science, Foundation for Research and Technology-Hellas, Crete, Greece
  • Venue: IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
  • Year: 2006

Abstract

Inference of phylogenetic trees using the maximum likelihood (ML) method is NP-hard. Furthermore, computing the likelihood function for huge trees of more than 1,000 organisms is computationally intensive because of the large number of floating-point operations and the high memory consumption. Within this context, the present paper compares two competing mathematical models that account for evolutionary rate heterogeneity: the Γ and CAT models. The intention of this paper is to show that, from a purely empirical point of view, CAT can be used instead of Γ. The main advantages of CAT over Γ are significantly lower memory consumption and faster inference times. An experimental study using RAxML was performed on 19 real-world datasets comprising 73 to 1,663 DNA sequences. Results show that CAT is on average 5.5 times faster than Γ and, surprisingly enough, also yields trees with slightly superior Γ likelihood values. Using the CAT model reduces the average number of L2 and L3 cache misses by a factor of 8.55.
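
The memory and speed argument can be illustrated with a minimal sketch. It is not RAxML's implementation. It assumes the common discrete Γ setup with four equally weighted rate categories, a per-site single rate for CAT, and, for simplicity, the Jukes-Cantor (JC69) substitution model on a two-sequence "tree". The rate values and the helper names `jc69_p`, `site_likelihood`, and `clv_floats` are hypothetical and chosen only for illustration.

```python
import math

def jc69_p(same: bool, t: float) -> float:
    """JC69 probability of observing the same (or one specific different)
    nucleotide after a branch of length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def site_likelihood(x: str, y: str, t: float, rates: list[float]) -> float:
    """Likelihood of one alignment column (x, y) for two sequences joined by
    a branch of length t, averaged over equally weighted rate categories.
    With a single rate in `rates`, this is the CAT-style computation."""
    return sum(0.25 * jc69_p(x == y, t * r) for r in rates) / len(rates)

def clv_floats(n_sites: int, n_states: int, n_rates: int) -> int:
    """Floats stored per inner node: one conditional likelihood entry per
    site, per state, per rate category evaluated at that site."""
    return n_sites * n_states * n_rates

# Illustrative values only (assumption): four rates with mean 1.0.
# In a real Γ model they are derived from the shape parameter alpha.
gamma_rates = [0.1, 0.4, 1.0, 2.5]
# Under CAT each site is assigned one rate; here a hypothetical slow site.
cat_rate_for_site = [0.4]

lik_gamma = site_likelihood("A", "G", 0.1, gamma_rates)      # 4 evaluations
lik_cat = site_likelihood("A", "G", 0.1, cat_rate_for_site)  # 1 evaluation

mem_gamma = clv_floats(n_sites=1000, n_states=4, n_rates=len(gamma_rates))
mem_cat = clv_floats(n_sites=1000, n_states=4, n_rates=1)
print(mem_gamma // mem_cat)  # ratio of per-node storage, Γ versus CAT
```

Under these assumptions, Γ evaluates and stores every site once per rate category, while CAT evaluates each site under its single assigned rate, which is where the reduction in floating-point work and memory comes from. The measured speedup and cache-miss figures in the abstract come from the paper's experiments, not from this sketch.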