Support vector training of protein alignment models

Authors:
Chun-Nam John Yu;Thorsten Joachims;Ron Elber;Jaroslaw Pillardy
Affiliations:
Dept. of Computer Science, Cornell University, Ithaca, NY;Dept. of Computer Science, Cornell University, Ithaca, NY;Dept. of Computer Science, Cornell University, Ithaca, NY;Cornell Theory Center, Cornell University, Ithaca, NY
Venue:
RECOMB'07 Proceedings of the 11th annual international conference on Research in computational molecular biology
Year:
2007

Abstract

Sequence-to-structure alignment is an important step in homology modeling of protein structures. Incorporating features such as secondary structure, solvent accessibility, or evolutionary information improves sequence-to-structure alignment accuracy, but conventional generative estimation techniques for alignment models impose independence assumptions that make these features difficult to include in a principled way. In this paper, we overcome this problem using a Support Vector Machine (SVM) method that provides a well-founded way of estimating complex alignment models with hundreds of thousands of parameters. Furthermore, we show that the method can be trained using a variety of loss functions. In a rigorous empirical evaluation, the SVM algorithm outperforms SSALN, a highly accurate generative alignment model that incorporates structural information. The alignment model learned by the SVM aligns 47% of the residues correctly and aligns over 70% of the residues within a shift of 4 positions.
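
The two accuracy figures at the end of the abstract can be read as a shift-tolerant per-residue score. The sketch below is one plausible reading, not the paper's evaluation code: it assumes an alignment is stored as a mapping from sequence positions to structure positions, and it counts a residue as correct when the predicted partner lies within `max_shift` positions of the reference partner. The function name `shift_accuracy`, the dictionary representation, and the toy data are all hypothetical.

```python
def shift_accuracy(reference, predicted, max_shift=0):
    """Fraction of reference-aligned residues placed within max_shift positions.

    reference, predicted: dicts mapping a sequence position to the structure
    position it is aligned with (an assumed representation; residues aligned
    to a gap are simply absent from the dict).
    """
    if not reference:
        raise ValueError("reference alignment has no aligned pairs")
    hits = 0
    for seq_pos, ref_struct_pos in reference.items():
        pred_struct_pos = predicted.get(seq_pos)
        # A residue the prediction leaves unaligned can never count as correct.
        if pred_struct_pos is not None and abs(pred_struct_pos - ref_struct_pos) <= max_shift:
            hits += 1
    return hits / len(reference)


# Toy alignments, invented purely for illustration.
reference = {0: 0, 1: 1, 2: 2, 3: 5, 4: 6}
predicted = {0: 0, 1: 2, 2: 3, 3: 5, 4: 11}

exact = shift_accuracy(reference, predicted)                  # exact matches only
within_4 = shift_accuracy(reference, predicted, max_shift=4)  # tolerate shifts up to 4
print(exact, within_4)
```

With `max_shift=0` the score corresponds to the "aligned correctly" figure, and with `max_shift=4` to the "within a shift of 4 positions" figure, under the assumptions stated above.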