Motivation: Although protein-DNA binding is vitally important to cell function, its mechanism is not yet completely understood. We therefore analysed the relationship between DNA binding and protein sequence composition, solvent accessibility and secondary structure. Using non-redundant databases of transcription factors and protein-DNA complexes, we developed neural network models that use the information present in this relationship to predict DNA-binding proteins and their binding residues.

Results: Sequence composition was found to provide sufficient information to predict the probability of a protein binding to DNA with nearly 69% sensitivity at 64% accuracy for the proteins considered; sequence neighbourhood and solvent accessibility information were sufficient to make binding site predictions with 40% sensitivity at 79% accuracy. Detailed analysis of binding residues shows that some three- and five-residue segments frequently bind to DNA and that solvent accessibility plays a major role in binding. Although binding behaviour was not associated with any particular secondary structure, there were interesting exceptions at the residue level. Over-representation of some residues in the binding sites was largely lost at the total sequence level, but a different kind of compositional preference was observed in DNA-binding proteins.

Availability: Online predictions of DNA-binding proteins and binding sites are available at http://www.netasa.org/dbs-pred/
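As an illustration only, the two sequence-derived inputs named in the abstract (whole-sequence composition and the sequence neighbourhood of each residue) could be computed roughly as follows. This is a minimal sketch and not the authors' implementation: the function names, the padding character `X` and the `flank` parameter are assumptions introduced here, and the abstract does not specify how the features were encoded for the neural networks.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the fraction of each standard amino acid in the sequence.

    The result is a 20-element list ordered as AMINO_ACIDS.
    Non-standard characters are ignored (an assumption of this sketch).
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def neighbourhood_windows(sequence, flank=2, pad="X"):
    """Return one window per residue: the residue plus `flank` neighbours per side.

    flank=1 gives three-residue segments and flank=2 gives five-residue
    segments. Positions beyond the chain ends are filled with `pad`
    (a hypothetical choice made for this sketch).
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    width = 2 * flank + 1
    return [padded[i:i + width] for i in range(len(seq))]


if __name__ == "__main__":
    example = "MKRAAC"  # made-up sequence, for illustration only
    print(composition(example))
    print(neighbourhood_windows(example, flank=1))
```

A composition vector of this kind describes a whole protein and would suit protein-level prediction, while per-residue windows (optionally combined with solvent accessibility values) would suit binding-site prediction.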