Advances in Rosetta protein structure prediction on massively parallel systems

  • Authors:
  • S. Raman; B. Qian; D. Baker; R. C. Walker

  • Affiliations:
  • Department of Biochemistry, University of Washington, Seattle, Washington; Department of Biochemistry, University of Washington, Seattle, Washington; University of Washington, Seattle, Washington; San Diego Supercomputer Center, University of California at San Diego, La Jolla, California

  • Venue:
  • IBM Journal of Research and Development
  • Year:
  • 2008

Abstract

One of the key challenges in computational biology is the prediction of three-dimensional protein structures from amino-acid sequences. For most proteins, the "native state" lies at the bottom of a free-energy landscape. Protein structure prediction involves varying the degrees of freedom of the protein in a constrained manner until it approaches its native state. In the Rosetta protein structure prediction protocols, a large number of independent folding trajectories are simulated, and several of the lowest-energy results are likely to be close to the native state. The availability of hundred-teraflop and, shortly, petaflop computing resources is revolutionizing the approaches available for protein structure prediction. Here, we discuss issues involved in utilizing such machines efficiently with the Rosetta code, including an overview of recent results of the Critical Assessment of Techniques for Protein Structure Prediction 7 (CASP7), in which the computationally demanding structure-refinement process was run on 16 racks of the IBM Blue Gene/L™ system at the IBM T. J. Watson Research Center. We highlight recent advances in high-performance computing and discuss future development paths that make use of the next-generation petascale (10^15 floating-point operations per second) machines.
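
The sampling strategy described in the abstract (many independent trajectories, then selection of the lowest-energy results) can be illustrated with a minimal sketch. This is not the Rosetta code or its energy function: `toy_energy`, `run_trajectory`, `lowest_energy_results`, and all parameter values are hypothetical names and numbers chosen for illustration, and a simple Metropolis Monte Carlo search over a handful of angles stands in for a real folding protocol. Because each trajectory depends only on its own seed, the trajectories could in principle be distributed across processors with no communication until the final ranking step.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical stand-in for an energy function; equals 0 when every angle is 0."""
    return sum(1.0 - math.cos(a) for a in angles)


def run_trajectory(seed, n_angles=8, n_steps=2000, temperature=0.2):
    """Run one independent Metropolis Monte Carlo trajectory from a random start.

    Returns (best_energy, seed, best_angles) for the lowest-energy state visited.
    """
    rng = random.Random(seed)  # private generator, so trajectories do not interact
    angles = [rng.uniform(-math.pi, math.pi) for _ in range(n_angles)]
    energy = toy_energy(angles)
    best_energy, best_angles = energy, list(angles)
    for _ in range(n_steps):
        i = rng.randrange(n_angles)
        old = angles[i]
        angles[i] = old + rng.gauss(0.0, 0.3)  # perturb one degree of freedom
        new_energy = toy_energy(angles)
        delta = new_energy - energy
        # Metropolis criterion: always accept downhill moves, sometimes uphill ones.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best_energy, best_angles = energy, list(angles)
        else:
            angles[i] = old  # reject the move and restore the previous value
    return best_energy, seed, best_angles


def lowest_energy_results(n_trajectories=16, keep=3):
    """Run the trajectories one after another and keep the lowest-energy results."""
    results = [run_trajectory(seed) for seed in range(n_trajectories)]
    results.sort(key=lambda r: r[0])
    return results[:keep]


if __name__ == "__main__":
    for energy, seed, _ in lowest_energy_results():
        print(f"seed={seed} energy={energy:.4f}")
```

The list comprehension in `lowest_energy_results` is the only place where the trajectories meet, which is why this kind of workload maps naturally onto large numbers of processors: the loop body could be handed to separate workers and only the small result tuples would need to be gathered.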