Support vector training of protein alignment models

  • Authors:
  • Chun-Nam John Yu; Thorsten Joachims; Ron Elber; Jaroslaw Pillardy

  • Affiliations:
  • Dept. of Computer Science, Cornell University, Ithaca, NY; Dept. of Computer Science, Cornell University, Ithaca, NY; Dept. of Computer Science, Cornell University, Ithaca, NY; Cornell Theory Center, Cornell University, Ithaca, NY

  • Venue:
  • RECOMB'07 Proceedings of the 11th annual international conference on Research in computational molecular biology
  • Year:
  • 2007

Abstract

Sequence-to-structure alignment is an important step in homology modeling of protein structures. Incorporating features such as secondary structure, solvent accessibility, or evolutionary information improves sequence-to-structure alignment accuracy, but conventional generative estimation techniques for alignment models impose independence assumptions that make these features difficult to include in a principled way. In this paper, we overcome this problem using a Support Vector Machine (SVM) method that provides a well-founded way of estimating complex alignment models with hundreds of thousands of parameters. Furthermore, we show that the method can be trained using a variety of loss functions. In a rigorous empirical evaluation, the SVM algorithm outperforms SSALN, a highly accurate generative alignment model that incorporates structural information. The alignment model learned by the SVM aligns 47% of the residues correctly and aligns over 70% of the residues within a shift of 4 positions.
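
The two accuracy figures quoted above (residues aligned exactly, and residues aligned within a shift of 4 positions) can be illustrated with a minimal sketch. The abstract does not spell out the exact scoring definition, so the code below assumes one plausible reading: a residue counts as correct if its predicted partner lies within `max_shift` positions of its partner in the reference alignment. The function name `shift_accuracy`, the dictionary representation of an alignment, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def shift_accuracy(
    predicted: Dict[int, int],
    reference: Dict[int, int],
    max_shift: int = 0,
) -> float:
    """Fraction of reference-aligned residues recovered within max_shift.

    Both alignments map a sequence position to the structure position it
    is aligned with (an assumed representation). Residues that are
    unaligned in the prediction count as misses.
    """
    if not reference:
        raise ValueError("reference alignment has no aligned residues")
    hits = 0
    for seq_pos, ref_partner in reference.items():
        pred_partner = predicted.get(seq_pos)
        if pred_partner is not None and abs(pred_partner - ref_partner) <= max_shift:
            hits += 1
    return hits / len(reference)


# Toy alignments (made-up data, for illustration only).
reference = {0: 0, 1: 1, 2: 2, 3: 5, 4: 6}
predicted = {0: 0, 1: 2, 2: 3, 3: 5, 4: 11}

print(shift_accuracy(predicted, reference, max_shift=0))  # exact matches: 0.4
print(shift_accuracy(predicted, reference, max_shift=4))  # within 4 positions: 0.8
```

Under this reading, the shift-tolerant score is always at least as large as the exact score, which is consistent with the 47% and over-70% figures reported in the abstract.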