Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information

  • Authors:
  • Shandar Ahmad; M. Michael Gromiha; Akinori Sarai

  • Affiliations:
  • Department of Biochemical Science and Engineering, Kyushu Institute of Technology, Fukuoka, Iizuka 820 8502, Japan; Computational Biology Research Center (CBRC), AIST, 2-41-6, Aomi, Koto-ku, Tokyo 135 0064, Japan; Department of Biochemical Science and Engineering, Kyushu Institute of Technology, Fukuoka, Iizuka 820 8502, Japan

  • Venue:
  • Bioinformatics
  • Year:
  • 2004

Quantified Score

Hi-index 3.84

Abstract

Motivation: Though vitally important to cell function, the mechanism of protein-DNA binding is not yet completely understood. We therefore analysed the relationship between DNA binding and protein sequence composition, solvent accessibility and secondary structure. Using non-redundant databases of transcription factors and protein-DNA complexes, we developed neural network models that use the information in this relationship to predict DNA-binding proteins and their binding residues.

Results: Sequence composition was found to provide sufficient information to predict the probability of a protein binding to DNA with nearly 69% sensitivity at 64% accuracy for the proteins considered; sequence neighbourhood and solvent accessibility information were sufficient to make binding site predictions with 40% sensitivity at 79% accuracy. Detailed analysis of binding residues shows that some three- and five-residue segments frequently bind to DNA and that solvent accessibility plays a major role in binding. Although binding behaviour was not associated with any particular secondary structure, there were interesting exceptions at the residue level. Over-representation of some residues in the binding sites was largely lost at the total sequence level, but a different kind of compositional preference was observed in DNA-binding proteins.

Availability: Online predictions of DNA-binding proteins and binding sites are available at http://www.netasa.org/dbs-pred/
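The abstract names two kinds of sequence-derived input: whole-sequence composition for protein-level prediction, and a residue's sequence neighbourhood for binding-site prediction. The Python sketch below illustrates what such inputs could look like. It is not the authors' implementation: the function names `composition` and `neighbourhoods`, the padding character `X`, and the default window of five residues are assumptions made for illustration, and the abstract does not specify how the inputs were encoded for the neural networks.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes (fixed order for the vector).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the fraction of each standard amino acid in the sequence.

    Hypothetical helper: non-standard characters are ignored.
    """
    seq = sequence.upper()
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def neighbourhoods(sequence, flank=2, pad="X"):
    """Return one window of length 2*flank + 1 centred on each residue.

    Hypothetical helper: the ends are padded with `pad` so that terminal
    residues also get a full-width window (an assumption, not from the paper).
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    width = 2 * flank + 1
    return [padded[i:i + width] for i in range(len(seq))]
```

With `flank=1` the windows are three residues wide and with `flank=2` they are five residues wide, matching the segment lengths mentioned in the Results. In a real pipeline each window would still need a numeric encoding, possibly combined with solvent accessibility values, before being passed to a classifier.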