A new protein graph model for function prediction

  • Authors:
  • Marco A. Alvarez; Changhui Yan

  • Affiliations:
  • Department of Computer Science, Utah State University, Logan, UT 84322, USA; Department of Computer Science, North Dakota State University, Fargo, ND 58103, USA

  • Venue:
  • Computational Biology and Chemistry
  • Year:
  • 2012

Abstract

As structural proteomics projects produce a growing number of protein structures with unknown function, methods that can reliably predict protein function from structure are urgently needed. In this paper, we present a method that exploits the clustering patterns of amino acids in three-dimensional space for protein function prediction. First, the amino acid residues of a protein structure are clustered into spatial groups using hierarchical agglomerative clustering, based on the distances between them. Second, the protein structure is represented as a graph in which each node denotes a cluster of amino acids. The nodes are labeled with an evolutionary profile derived from the multiple alignment of homologous sequences. Then, a shortest-path graph kernel is used to calculate similarities between the graphs. Finally, a support vector machine using this graph kernel is trained to classify proteins by function. We applied the proposed method to two separate problems, namely, prediction of enzymes and prediction of DNA-binding proteins. In both cases, the results showed that the proposed method outperformed other state-of-the-art methods.
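The pipeline described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the average-linkage criterion, the merge cutoff, the rule that connects two clusters when their centroids lie within `edge_dist`, the unit edge weights, the Gaussian node kernel, and the toy coordinates and two-dimensional "profiles" are all assumptions made for illustration. The abstract does not specify any of these details.

```python
import math
from itertools import combinations

def agglomerative_clusters(coords, cutoff):
    """Average-linkage agglomerative clustering; stop once the closest
    pair of clusters is farther apart than `cutoff` (assumed criterion)."""
    clusters = [[i] for i in range(len(coords))]

    def linkage(a, b):
        total = sum(math.dist(coords[i], coords[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda t: t[1])
        if d > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return clusters

def build_graph(coords, profiles, cutoff, edge_dist):
    """Return (node_labels, shortest_path_matrix) for one structure.
    Nodes are residue clusters labeled with the mean profile vector."""
    clusters = agglomerative_clusters(coords, cutoff)
    dim = len(profiles[0])
    centers = [tuple(sum(coords[m][k] for m in c) / len(c) for k in range(3))
               for c in clusters]
    labels = [[sum(profiles[m][k] for m in c) / len(c) for k in range(dim)]
              for c in clusters]
    n = len(clusters)
    dist = [[0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        if math.dist(centers[i], centers[j]) <= edge_dist:
            dist[i][j] = dist[j][i] = 1  # unweighted edge (assumption)
    for k in range(n):  # Floyd-Warshall all-pairs shortest paths
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return labels, dist

def node_kernel(a, b, gamma=1.0):
    """Gaussian kernel between two node label vectors (assumed choice)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def shortest_path_kernel(g1, g2):
    """Sum over pairs of equal-length shortest paths of the product of
    endpoint node kernels, counting both endpoint orientations."""
    (l1, d1), (l2, d2) = g1, g2
    total = 0.0
    for i, j in combinations(range(len(l1)), 2):
        if math.isinf(d1[i][j]):
            continue
        for p, q in combinations(range(len(l2)), 2):
            if d1[i][j] != d2[p][q]:
                continue
            total += (node_kernel(l1[i], l2[p]) * node_kernel(l1[j], l2[q])
                      + node_kernel(l1[i], l2[q]) * node_kernel(l1[j], l2[p]))
    return total

# Toy input: made-up residue coordinates and 2-dimensional "profiles".
coords_a = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0), (20, 0, 0), (21, 0, 0)]
profiles_a = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8), (0.5, 0.5), (0.4, 0.6)]
coords_b = [(0, 0, 0), (0, 1, 0), (0, 10, 0), (0, 11, 0)]
profiles_b = [(0.9, 0.1), (0.7, 0.3), (0.2, 0.8), (0.1, 0.9)]

graph_a = build_graph(coords_a, profiles_a, cutoff=3.0, edge_dist=12.0)
graph_b = build_graph(coords_b, profiles_b, cutoff=3.0, edge_dist=12.0)
graphs = [graph_a, graph_b]
gram = [[shortest_path_kernel(g, h) for h in graphs] for g in graphs]
```

The resulting Gram matrix `gram` is what a kernel classifier would consume. For example, scikit-learn's `SVC(kernel="precomputed")` accepts such a matrix in place of feature vectors, which is one way the final SVM step could be reproduced.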