Protein secondary structure: entropy, correlations and prediction

  • Authors:
  • Gavin E. Crooks; Steven E. Brenner

  • Affiliations:
  • Department of Plant and Microbial Biology, University of California, 111 Koshland Hall No. 3102, Berkeley, CA 94720-3102, USA;Department of Plant and Microbial Biology, University of California, 111 Koshland Hall No. 3102, Berkeley, CA 94720-3102, USA

  • Venue:
  • Bioinformatics
  • Year:
  • 2004

Abstract

Motivation: Is protein secondary structure primarily determined by local interactions between residues closely spaced along the amino acid backbone or by non-local tertiary interactions? To answer this question, we measure the entropy densities of primary and secondary structure sequences, and the local inter-sequence mutual information density.

Results: We find that the important inter-sequence interactions are short ranged, that correlations between neighboring amino acids are essentially uninformative and that only one-fourth of the total information needed to determine the secondary structure is available from local inter-sequence correlations. These observations support the view that the majority of most proteins fold via a cooperative process where secondary and tertiary structure form concurrently. Moreover, existing single-sequence secondary structure prediction algorithms are almost optimal, and we should not expect a dramatic improvement in prediction accuracy.

Availability: Both the data sets and analysis code are freely available from our Web site at http://compbio.berkeley.edu/
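
The quantities named in the abstract (sequence entropy and the mutual information between primary and secondary structure) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the simple plug-in (frequency-count) estimator, and the toy sequences below are all assumptions made for illustration. Plug-in estimates are biased on small samples, so a real analysis would need a correction or a much larger data set.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Plug-in Shannon entropy, in bits, of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def offset_mutual_information(residues, structure, offset):
    """Mutual information between the residue at position i + offset and the
    secondary structure label at position i (hypothetical helper)."""
    pairs = [
        (residues[i + offset], structure[i])
        for i in range(len(structure))
        if 0 <= i + offset < len(residues)
    ]
    xs = [residue for residue, _ in pairs]
    ys = [label for _, label in pairs]
    return mutual_information(xs, ys)


# Toy, made-up data: residue letters aligned with structure labels (H, E).
toy_residues = "AAGGAAGG"
toy_structure = "HHEEHHEE"
mi_same_position = offset_mutual_information(toy_residues, toy_structure, 0)
mi_independent = mutual_information("AAGG", "HEHE")
```

Scanning `offset` over a range of positive and negative values gives a rough picture of how quickly the residue-to-structure information decays with distance along the chain, which is the kind of locality question the abstract raises.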