Functional neighbors: inferring relationships between nonhomologous protein families using family-specific packing motifs

  • Authors:
  • Deepak Bandyopadhyay;Jun Huan;Jinze Liu;Jan Prins;Jack Snoeyink;Wei Wang;Alexander Tropsha

  • Affiliations:
  • Department of Computational and Structural Chemistry, Collegeville, PA;Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS;Department of Computer Science, University of Kentucky, Lexington, KY;Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC;Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC;Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC;Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC

  • Venue:
  • IEEE Transactions on Information Technology in Biomedicine
  • Year:
  • 2010

Abstract

We describe a new approach for inferring functional relationships between nonhomologous protein families by examining the statistical enrichment of alternative function predictions in classification hierarchies such as Gene Ontology (GO) and Structural Classification of Proteins (SCOP). Protein structures are represented by robust graphs, and the Fast Frequent Subgraph Mining (FFSM) algorithm is applied to protein families to generate sets of family-specific packing motifs, i.e., amino acid residue packing patterns shared by most family members but infrequent in other proteins. The function of a protein is inferred by identifying in it motifs characteristic of a known family. We employ these family-specific motifs to elucidate functional relationships between families in the GO and SCOP hierarchies. Specifically, we postulate that two families are functionally related if one family is statistically enriched in motifs characteristic of another family, i.e., if the number of proteins in one family that contain a motif from the other family is greater than expected by chance. This function inference method can help annotate proteins of unknown function, establish functional neighbors of existing families, and specify alternate functions for known proteins.
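The enrichment criterion in the abstract ("greater than expected by chance") can be illustrated with a one-sided hypergeometric test. This is only a minimal sketch: the abstract does not state which statistical test is used, so the choice of the hypergeometric distribution, the function names `hypergeom_sf` and `is_enriched`, the 0.05 threshold, and the counts in the usage lines are all assumptions made for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: number of proteins in the background set (assumed)
    K: background proteins containing the motif of family A
    n: number of proteins in family B
    k: proteins in family B containing the motif of family A
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the exact probability mass from k up to the largest possible count.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def is_enriched(k, N, K, n, alpha=0.05):
    """True if family B holds more motif-containing proteins than chance predicts.

    alpha is a hypothetical significance threshold, not a value from the paper.
    """
    return hypergeom_sf(k, N, K, n) < alpha


# Hypothetical counts: 1000 background proteins, 50 of which contain the motif;
# family B has 20 members, 8 of which contain the motif.
expected = 20 * 50 / 1000  # count expected by chance
p_value = hypergeom_sf(8, 1000, 50, 20)
print(expected, p_value, is_enriched(8, 1000, 50, 20))
```

With these made-up counts, chance alone predicts about one motif-containing protein in family B, so observing eight would flag the two families as candidate functional neighbors. A real analysis over many family pairs would also need a multiple-testing correction, which this sketch omits.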