Parallel solution of large-scale eigenvalue problem for master equation in protein folding dynamics

  • Authors: Yiming Li; Shao-Ming Yu; Yih-Lang Li

  • Affiliations: Department of Communication Engineering, National Chiao Tung University, Hsinchu, Taiwan; Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan; Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan

  • Venue: Journal of Parallel and Distributed Computing
  • Year: 2008

Abstract

A master equation characterizes the time evolution of trajectories and the transitions between states in protein folding dynamics. Solving the master equation may require computing the eigenvalues of the corresponding eigenvalue problem. In this paper, we numerically study the folding rate for a dynamic protein folding problem by solving a large-scale eigenvalue problem. Three methods, the implicitly restarted Arnoldi, Jacobi-Davidson, and QR methods, are employed to solve the large-scale eigenvalue problem for the transition matrix of the master equation. The comparison shows that the QR method demands tremendous computing resources when the sequence length L > 10, owing to the extremely large matrix size and CPU time limitations. The Jacobi-Davidson method may encounter convergence issues for cases of L > 9. Among the three, the implicitly restarted Arnoldi method proves suitable for these problems. We parallelize the implicitly restarted Arnoldi method on a PC-based Linux cluster using message passing interface (MPI) libraries. The parallelization scheme mainly partitions the matrix operations. For the Arnoldi factorization, we replicate the upper Hessenberg matrix H_m on each processor and distribute the set of Arnoldi vectors V_m among the processors; each processor then performs its own share of the operations. Numerical experiments on our 32-node cluster show that the maximum difference among processors is within 10%. A 23-fold speedup and 72% parallel efficiency are achieved when the matrix size is greater than 2 × 10^6. This parallel approach enables us to explore large-scale protein folding dynamics.
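
The data layout described in the abstract (Arnoldi vectors split among processors, Hessenberg matrix replicated) can be illustrated with a small serial sketch. This is not the authors' implementation: it simulates the processors inside one Python process, replaces MPI collectives with ordinary sums, and performs a plain m-step Arnoldi factorization without the implicit restart. The names `split_rows`, `allreduce_sum`, `arnoldi_partitioned`, and `toy_rate_matrix` are hypothetical, and the toy matrix (a linear chain of states with forward rate `kf` and backward rate `kb`) is an assumption made only to have a small master-equation-like input.

```python
import math


def split_rows(n, p):
    """Contiguous row ranges, one per simulated processor."""
    base, extra = divmod(n, p)
    bounds, start = [], 0
    for r in range(p):
        stop = start + base + (1 if r < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def allreduce_sum(partials):
    """Stand-in for an MPI all-reduce: every rank would receive this sum."""
    return sum(partials)


def toy_rate_matrix(n, kf=1.0, kb=0.5):
    """Hypothetical chain of n states; A[i][j] is the rate from j to i."""
    A = [[0.0] * n for _ in range(n)]
    for s in range(n - 1):
        A[s + 1][s] += kf      # forward step s -> s+1
        A[s][s] -= kf
        A[s][s + 1] += kb      # backward step s+1 -> s
        A[s + 1][s + 1] -= kb
    return A


def arnoldi_partitioned(A, v0, m, p):
    """m-step Arnoldi with vectors row-partitioned over p simulated ranks.

    Returns (V, H): V[j][r] is rank r's slice of Arnoldi vector j, and H is
    the (m+1) x m Hessenberg matrix that every rank would hold a copy of.
    """
    n = len(A)
    bounds = split_rows(n, p)
    norm = math.sqrt(allreduce_sum(
        sum(v0[i] ** 2 for i in range(a, b)) for a, b in bounds))
    V = [[[v0[i] / norm for i in range(a, b)] for a, b in bounds]]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        # Stand-in for an all-gather: each rank needs the whole vector v_j.
        full = [x for block in V[j] for x in block]
        # Each rank multiplies only its own rows of A.
        w = [[sum(A[i][k] * full[k] for k in range(n)) for i in range(a, b)]
             for a, b in bounds]
        # Modified Gram-Schmidt: local partial dot products, then a global sum.
        for i in range(j + 1):
            h = allreduce_sum(
                sum(x * y for x, y in zip(V[i][r], w[r])) for r in range(p))
            H[i][j] = h
            for r in range(p):
                w[r] = [y - h * x for x, y in zip(V[i][r], w[r])]
        beta = math.sqrt(allreduce_sum(
            sum(y * y for y in w[r]) for r in range(p)))
        H[j + 1][j] = beta
        if beta < 1e-12:
            break  # Krylov subspace is invariant; stop early
        V.append([[y / beta for y in w[r]] for r in range(p)])
    return V, H
```

Calling `arnoldi_partitioned(toy_rate_matrix(8), [1.0] + [0.0] * 7, 4, 3)` splits each 8-entry vector into slices of 3, 3, and 2 rows. The only global operations are the sums inside `allreduce_sum` and the gather before the matrix-vector product, which is why H can be kept identical on every rank while the long vectors stay distributed. In a real MPI code the eigenvalues of the small matrix H_m would then be computed redundantly on each processor.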