Predicting functionally important residues from sequence conservation

Authors:
John A. Capra; Mona Singh
Venue:
Bioinformatics
Year:
2007

Abstract

Motivation: Not all residues in a protein are equally important. Some are essential for the proper structure and function of the protein, whereas others can be readily replaced. Conservation analysis is one of the most widely used methods for predicting these functionally important residues in protein sequences.

Results: We introduce an information-theoretic approach for estimating sequence conservation based on Jensen–Shannon divergence. We also develop a general heuristic that considers the estimated conservation of sequentially neighboring sites. In large-scale testing, we demonstrate that our combined approach outperforms previous conservation-based measures in identifying functionally important residues; in particular, it is significantly better than the commonly used Shannon entropy measure. We find that considering conservation at sequential neighbors improves the performance of all methods tested. Our analysis also reveals that many existing methods that attempt to incorporate the relationships between amino acids do not lead to better identification of functionally important sites. Finally, we find that while conservation is highly predictive in identifying catalytic sites and residues near bound ligands, it is much less effective in identifying residues in protein–protein interfaces.

Availability: Data sets and code for all conservation measures evaluated are available at http://compbio.cs.princeton.edu/conservation/

Contact: mona@cs.princeton.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
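
To make the two ideas in the Results concrete, the following is a minimal Python sketch of Jensen–Shannon divergence column scoring followed by a neighbor-window adjustment. It is not the authors' released code. The function names, the uniform background distribution, the pseudocount, the skipping of gap characters, and the `half_width` and `lam` parameters are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1e-6):
    """Estimate amino acid frequencies for one alignment column.

    Gap and unknown characters are skipped (an assumption of this sketch).
    """
    counts = Counter(c for c in column if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return [(counts[a] + pseudocount) / total for a in AMINO_ACIDS]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (0 to 1 bit)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def conservation_scores(alignment, background=None):
    """Score each column of an alignment (list of equal-length strings).

    A uniform background is assumed here purely for illustration.
    """
    if background is None:
        background = [1.0 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)
    return [
        js_divergence(column_distribution(col), background)
        for col in zip(*alignment)
    ]


def window_scores(scores, half_width=3, lam=0.5):
    """Mix each column's score with the mean score of its sequence neighbors.

    half_width and lam are hypothetical parameters of this sketch.
    """
    adjusted = []
    for i, s in enumerate(scores):
        lo = max(0, i - half_width)
        hi = min(len(scores), i + half_width + 1)
        neighbors = [scores[j] for j in range(lo, hi) if j != i]
        if neighbors:
            s = (1 - lam) * s + lam * sum(neighbors) / len(neighbors)
        adjusted.append(s)
    return adjusted


if __name__ == "__main__":
    toy_alignment = ["ACD", "ACE", "ACF"]  # toy data, not from the paper
    raw = conservation_scores(toy_alignment)
    print(raw)
    print(window_scores(raw, half_width=1))
```

In this sketch a column scores higher the further its amino acid distribution diverges from the background, so fully conserved columns rank above variable ones. The window step then raises the score of a column whose neighbors are also conserved. A practical implementation would also need choices this sketch leaves out, such as sequence weighting and an explicit gap penalty.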