When aligning biological sequences, the choice of parameter values for the alignment scoring function is critical. Small changes in gap penalties, for example, can yield radically different alignments. A rigorous way to compute parameter values that are appropriate for biological sequences is inverse parametric sequence alignment. Given a collection of examples of biologically correct alignments, this is the problem of finding parameter values that make the example alignments score close to optimal. We extend prior work on inverse alignment to partial examples and to an improved model based on minimizing the average error of the examples. Experiments on benchmark biological alignments show we can find parameters that generalize across protein families and that boost the recovery rate for multiple sequence alignment by up to 25%.
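To make the inverse alignment criterion concrete, the following is a minimal sketch and not the authors' method. It assumes a deliberately simplified cost model (cost minimization, a single mismatch cost, a linear gap cost) and replaces the optimization machinery with a plain search over candidate gap costs. The function names `alignment_cost`, `optimal_cost`, `average_error`, and `fit_gap_cost` are hypothetical. The error of an example is taken here as the difference between its cost and the cost of an optimal alignment of the same sequences; the mismatch cost is held fixed so that the trivial all-zero parameter choice is excluded.

```python
from typing import List, Sequence, Tuple

# An example alignment: two equal-length rows, with '-' marking a gap.
Alignment = Tuple[str, str]


def alignment_cost(example: Alignment, mismatch: float, gap: float) -> float:
    """Cost of a given alignment under the assumed column-wise cost model."""
    row_a, row_b = example
    cost = 0.0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            cost += gap        # column containing a gap
        elif x != y:
            cost += mismatch   # substitution column
    return cost


def optimal_cost(a: str, b: str, mismatch: float, gap: float) -> float:
    """Minimum cost of a global alignment of a and b (standard dynamic program)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitute = prev[j - 1] + (0.0 if a[i - 1] == b[j - 1] else mismatch)
            curr.append(min(substitute, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def average_error(examples: List[Alignment], mismatch: float, gap: float) -> float:
    """Mean amount by which the examples cost more than an optimal alignment."""
    total = 0.0
    for row_a, row_b in examples:
        a, b = row_a.replace("-", ""), row_b.replace("-", "")
        total += alignment_cost((row_a, row_b), mismatch, gap) - optimal_cost(a, b, mismatch, gap)
    return total / len(examples)


def fit_gap_cost(examples: List[Alignment], candidates: Sequence[float],
                 mismatch: float = 1.0) -> float:
    """Return the first candidate gap cost that attains the smallest average error."""
    return min(candidates, key=lambda g: average_error(examples, mismatch, g))
```

With the two toy examples `("ACGT", "AGGT")` and `("A-CG", "AGC-")`, a gap cost that is too low makes the first example suboptimal (two gaps become cheaper than its one mismatch), while a gap cost that is too high makes the second suboptimal (two mismatches become cheaper than its two gaps). Only intermediate gap costs make both examples optimal, which is the kind of constraint inverse alignment extracts from example alignments.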