Parallel algorithms for Bayesian phylogenetic inference

  • Authors: Xizhou Feng; Duncan A. Buell; John R. Rose; Peter J. Waddell

  • Affiliations:
  • Xizhou Feng, Duncan A. Buell, John R. Rose: Department of Computer Science and Engineering, University of South Carolina, Columbia, SC
  • Peter J. Waddell: Department of Statistics and Department of Biological Sciences, University of South Carolina, Columbia, SC

  • Venue: Journal of Parallel and Distributed Computing - High-performance computational biology
  • Year: 2003

Abstract

The combination of a Markov chain Monte Carlo (MCMC) method with likelihood-based assessment of phylogenies is becoming a popular alternative to direct likelihood optimization. However, MCMC, like maximum likelihood, is a computationally expensive method. To approximate the posterior distribution of phylogenies, a Markov chain is constructed, using the Metropolis algorithm, such that the chain has the posterior distribution of the parameters of phylogenies as its stationary distribution.

This paper describes parallel algorithms and their MPI-based parallel implementation for MCMC-based Bayesian phylogenetic inference. Bayesian phylogenetic inference is computationally expensive in both time and memory. We developed our variations on MCMC, and their implementation, to permit the study of large phylogenetic problems. In our approach, we can distribute either entire chains or parts of a chain to different processors, since in current models the columns of the data are independent. Evaluations on a 32-node Beowulf cluster suggest that the problem scales well. A number of important points are identified, including a superlinear speedup due to more effective cache usage and the point at which additional processors slow down the process due to communication overhead.
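
The idea of distributing parts of a chain can be illustrated with a small sketch. Because the columns of the data are independent, the log-likelihood is a sum over columns, so each processor can evaluate a block of columns and the partial sums can be combined before the Metropolis accept/reject step. The Python code below is not the paper's implementation: it assumes a toy model (two sequences, a single branch length `t` under the Jukes-Cantor model, a flat prior on `t`), it simulates the workers sequentially instead of using MPI, and all function names and parameter values are hypothetical.

```python
import math
import random


def site_loglik(differs, t):
    """Log-likelihood of one alignment column for two sequences.

    Toy assumption: Jukes-Cantor model, where the probability that the
    two sequences differ at a site is 3/4 * (1 - exp(-4t/3)).
    """
    p_diff = 0.75 * (1.0 - math.exp(-4.0 * t / 3.0))
    return math.log(p_diff) if differs else math.log(1.0 - p_diff)


def split_columns(columns, n_workers):
    """Split the columns into n_workers nearly equal contiguous blocks."""
    size, extra = divmod(len(columns), n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        end = start + size + (1 if w < extra else 0)
        chunks.append(columns[start:end])
        start = end
    return chunks


def partial_loglik(chunk, t):
    """Work done by one (simulated) processor: sum over its own columns."""
    return sum(site_loglik(c, t) for c in chunk)


def total_loglik(chunks, t):
    """Combine partial sums; a stand-in for a reduction across processors."""
    return sum(partial_loglik(chunk, t) for chunk in chunks)


def metropolis(chunks, n_steps, t0=0.1, step=0.05,
               t_min=1e-6, t_max=5.0, seed=0):
    """Metropolis sampler for t with a flat prior on (t_min, t_max)."""
    rng = random.Random(seed)
    t = t0
    ll = total_loglik(chunks, t)
    samples = []
    for _ in range(n_steps):
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        t_new = t + rng.uniform(-step, step)
        if t_min < t_new < t_max:  # proposals outside the prior are rejected
            ll_new = total_loglik(chunks, t_new)
            if rng.random() < math.exp(min(0.0, ll_new - ll)):
                t, ll = t_new, ll_new
        samples.append(t)
    return samples


# Toy data: 100 columns, 20 of which differ between the two sequences.
columns = [True] * 20 + [False] * 80
chunks = split_columns(columns, n_workers=4)
samples = metropolis(chunks, n_steps=2000)
```

In an MPI setting, each rank would hold one block of columns and the sum in `total_loglik` would become a reduction across ranks; distributing entire chains instead would amount to running `metropolis` with different seeds on different processors.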