Protein folding is frequently guided by local residue interactions that form clusters in the protein core. The interactions between residue clusters serve as potential nucleation sites in the folding process. Evidence suggests that these residue interactions are governed by the hydrophobic propensities of the residues. An array of hydrophobicity scales has been developed to determine the hydrophobic propensities of residues under different environmental conditions. In this work, we propose a graph-theory-based data mining framework to extract and isolate protein structural features that remain invariant across evolutionarily related proteins, through the integrated analysis of five well-known hydrophobicity scales over the 3D structure of proteins. We hypothesize that homologous proteins contain conserved hydrophobic residues and exhibit analogous residue interaction patterns in the folded state. The results demonstrate that discriminatory residue interaction patterns shared among proteins of the same family can be employed for both the structural and the functional annotation of proteins. On average, we obtained 90 percent accuracy in protein classification with a feature vector significantly smaller than those used in previous work in the area. This work presents a detailed study, together with validation evidence, to illustrate the efficacy of the method and the correctness of the reported results.
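To make the idea of a residue interaction graph concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single scale (Kyte-Doolittle, chosen here only for illustration, since the abstract does not name its five scales), one representative coordinate per residue, and a hypothetical 7.0 Å contact cutoff. Hydrophobic residues become nodes, spatially close pairs become edges, and connected components stand in for candidate residue clusters.

```python
import math
from itertools import combinations

# Kyte-Doolittle hydropathy values. Using this particular scale is an
# assumption; the source does not say which five scales it integrates.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_contact_graph(residues, cutoff=7.0, min_hydropathy=0.0):
    """Build an undirected graph over hydrophobic residues.

    residues: list of (one_letter_code, (x, y, z)) tuples, one point per residue.
    cutoff and min_hydropathy are illustrative thresholds, not values from the source.
    Returns a dict mapping residue index -> set of neighbouring residue indices.
    """
    nodes = [i for i, (aa, _) in enumerate(residues)
             if KYTE_DOOLITTLE[aa] > min_hydropathy]
    graph = {i: set() for i in nodes}
    for i, j in combinations(nodes, 2):
        if math.dist(residues[i][1], residues[j][1]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def residue_clusters(graph):
    """Return the connected components of the graph as sorted index lists."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(sorted(component))
    return clusters


# Toy coordinates (made up for illustration, not taken from a real structure).
toy = [
    ("L", (0.0, 0.0, 0.0)),
    ("I", (5.0, 0.0, 0.0)),
    ("V", (0.0, 5.0, 0.0)),
    ("K", (2.0, 2.0, 0.0)),   # hydrophilic on this scale, so excluded
    ("F", (30.0, 0.0, 0.0)),
    ("A", (33.0, 0.0, 0.0)),
]
toy_graph = hydrophobic_contact_graph(toy)
toy_clusters = residue_clusters(toy_graph)
```

A family-level analysis in the spirit of the abstract would then compare such graphs across related proteins and across several scales, keeping only the patterns that recur; that mining step is not shown here.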