PBPI: a high performance implementation of Bayesian phylogenetic inference

  • Authors:
  • Xizhou Feng;Kirk W. Cameron;Duncan A. Buell

  • Affiliations:
  • Virginia Polytechnic Institute and State University, Blacksburg, VA;Virginia Polytechnic Institute and State University, Blacksburg, VA;University of South Carolina, Columbia, SC

  • Venue:
  • Proceedings of the 2006 ACM/IEEE conference on Supercomputing
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper describes the implementation and performance of PBPI, a parallel implementation of Bayesian phylogenetic inference method for DNA sequence data. By combining the Markov Chain Monte Carlo (MCMC) method with likelihood-based assessment of phylogenies, Bayesian phylogenetic inferences can incorporate complex statistic models into the process of phylogenetic tree estimation. However, Bayesian analyses are extremely computationally expensive. PBPI uses algorithmic improvements and parallel processing to achieve significant performance improvement over comparable Bayesian phylogenetic inference programs. We evaluated the performance and accuracy of PBPI using a simulated dataset on System X, a terascale supercomputer at Virginia Tech. Our results show that PBPI identifies equivalent tree estimates 1424 times faster on 256 processors than a widely-used, best-available (albeit sequential), Bayesian phylogenetic inference program. PBPI also achieves linear speedup with the number of processors for large problem sizes. Most importantly, the PBPI framework enables Bayesian phylogenetic analysis of large datasets previously impracticable.