Reconstruction of large phylogenetic trees: A parallel approach

Authors:
Zhihua Du;Feng Lin;Usman W. Roshan
Affiliations:
BioInformatics Research Centre, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore;BioInformatics Research Centre, Nanyang Technological University, Nanyang Avenue, Singapore 639798, Singapore;College of Computing Sciences, Computer Sciences Department, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA
Venue:
Computational Biology and Chemistry
Year:
2005

Citing 4
Cited 5

Allocating independent tasks to parallel processors: an experimental study. Journal of Parallel and Distributed Computing - Special issue on dynamic load balancing
Absolute convergence: true trees from short sequences. SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
Solving Large Scale Phylogenetic Problems using DCM2. Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology
Rec-I-DCM3: A Fast Algorithmic Technique for Reconstructing Large Phylogenetic Trees. CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference

Exploring New Search Algorithms and Hardware for Phylogenetics: RAxML Meets the IBM Cell. Journal of VLSI Signal Processing Systems
Large-scale maximum likelihood-based phylogenetic analysis on the IBM BlueGene/L. Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Large-scale phylogenetic analysis on current HPC architectures. Scientific Programming - Large-Scale Programming Tools and Environments
Establishing a statistic model for recognition of steroid hormone response elements. Computational Biology and Chemistry
A scalable parallelization of the gene duplication problem. Journal of Parallel and Distributed Computing

Abstract

Reconstruction of phylogenetic trees for very large datasets is a known example of a computationally hard problem. In this paper, we present a parallel computing model for the widely used Multiple Instruction Multiple Data (MIMD) architecture. Following the idea of divide-and-conquer, our model adapts the recursive-DCM3 decomposition method [Roshan, U., Moret, B.M.E., Williams, T.L., Warnow, T., 2004a. Performance of supertree methods on various dataset decompositions. In: Bininda-Emonds, O.R.P. (Ed.), Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life, vol. 3 of Computational Biology, Kluwer Academic, pp. 301-328; Roshan, U., Moret, B.M.E., Williams, T.L., Warnow, T., 2004b. Rec-I-DCM3: A Fast Algorithmic Technique for Reconstructing Large Phylogenetic Trees, Proceedings of the IEEE Computational Systems Bioinformatics Conference (ICSB)] to divide datasets into smaller subproblems. It distributes the computational load over multiple processors so that the processors construct subtrees for the subproblems within a batch in parallel. It finally collects the resulting trees and merges them into a supertree. The proposed model is flexible as far as the methods for dividing and merging datasets are concerned. We show that our method greatly reduces the computational time of the sequential version of the program. As a case study, on one dataset our parallel approach takes only 22.1 h on four processors to outperform the best score to date (found at 123.7 h by the Rec-I-DCM3 program [Roshan et al., 2004a, 2004b]). Developed with the standard message-passing library, MPI, the program can be recompiled and run on any MIMD system.
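
The divide, distribute, and merge pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: `decompose` simply halves the taxon list (Rec-DCM3 instead uses a guide tree and produces overlapping subsets), `solve_subproblem` and `merge` are hypothetical placeholders for the base tree-search method and the supertree step, and a thread pool from the standard library stands in for the MPI processes used by the actual program.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(taxa, max_size):
    """Split taxa into pieces of at most max_size by recursive halving.

    Placeholder only: Rec-DCM3 uses a guide tree and overlapping subsets.
    """
    if len(taxa) <= max_size:
        return [taxa]
    mid = len(taxa) // 2
    return decompose(taxa[:mid], max_size) + decompose(taxa[mid:], max_size)


def solve_subproblem(subset):
    """Hypothetical base method: build a caterpillar tree as nested tuples."""
    tree = subset[0]
    for taxon in subset[1:]:
        tree = (tree, taxon)
    return tree


def merge(subtrees):
    """Hypothetical merge step: join the subtrees one after another."""
    tree = subtrees[0]
    for subtree in subtrees[1:]:
        tree = (tree, subtree)
    return tree


def parallel_reconstruct(taxa, max_size, workers):
    """Divide the dataset, solve one batch of subproblems in parallel, merge."""
    subsets = decompose(list(taxa), max_size)
    # Each worker plays the role of one processor; map() keeps input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        subtrees = list(pool.map(solve_subproblem, subsets))
    return merge(subtrees)


def leaves(tree):
    """Return the leaf labels of a nested-tuple tree, left to right."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]
```

Calling `parallel_reconstruct(taxa, max_size=3, workers=4)` on eight taxa yields four two-taxon subproblems that are solved concurrently and then joined; `leaves` can be used to check that every input taxon appears in the merged tree.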