Multiple sequence alignment based on profile alignment of intermediate sequences

Authors:
Yue Lu;Sing-Hoi Sze
Affiliations:
Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX;Department of Biochemistry & Biophysics and Department of Computer Science, Texas A&M University, College Station, TX
Venue:
RECOMB'07 Proceedings of the 11th annual international conference on Research in computational molecular biology
Year:
2007

Citing 5

A comparison of scoring functions for protein sequence profile alignment
Bioinformatics

Align-m - a new algorithm for multiple alignment of highly divergent sequences
Bioinformatics

SPEM: improving multiple sequence alignment with sequence profiles and predicted secondary structures
Bioinformatics

Probalign: multiple sequence alignment using partition function posterior probabilities
Bioinformatics

CONTRAlign: discriminative training for protein sequence alignment
RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology

Cited 2

Learning Scoring Schemes for Sequence Alignment from Partial Examples
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

A Knowledge-Based Multiple-Sequence Alignment Algorithm
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Abstract

Despite considerable effort, it remains difficult to obtain accurate multiple sequence alignments. A few strategies that use additional hits from a database search of the input sequences have been proposed to significantly improve alignment accuracy: constructing profiles from the hits and performing profile alignment, including high-scoring hits among the input sequences, using intermediate sequence search to link distant homologs, and using secondary structure information. We develop an algorithm that integrates these strategies to further improve alignment accuracy by modifying the pair-HMM approach in ProbCons to incorporate profiles of intermediate sequences from database search and to utilize secondary structure predictions as in SPEM. We test our algorithm on a few sets of benchmark multiple alignments, including BAliBASE, HOMSTRAD, PREFAB and SAB-mark, and show that it significantly outperforms MAFFT and ProbCons, which are among the best multiple alignment algorithms that do not utilize additional information, and SPEM, which is among the best multiple alignment algorithms that utilize additional hits from database search. The improvement in accuracy over SPEM can be as much as 5 to 10% when aligning divergent sequences. A software program that implements this approach (ISPAlign) is available at http://faculty.cs.tamu.edu/shsze/ispalign.
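
To make the idea of profile alignment concrete, the following is a minimal sketch in Python and not the ISPAlign implementation. It assumes each input sequence comes with a small set of already-aligned database hits, builds a residue-frequency profile per column, and aligns two profiles with plain Needleman-Wunsch dynamic programming. The abstract describes a modified pair-HMM approach instead, so the dynamic program, the dot-product column score, the pseudocount, and the gap penalty are all swapped-in simplifications, and the names build_profile, column_score, and align_profiles are hypothetical.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_profile(aligned_rows, pseudocount=0.5):
    """Column-wise residue frequencies from equal-length aligned rows.

    Characters outside ALPHABET (such as '-') are ignored. The pseudocount
    is an illustrative assumption, not a value from the paper.
    """
    profile = []
    for col in range(len(aligned_rows[0])):
        counts = Counter(r[col] for r in aligned_rows if r[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        profile.append({a: (counts[a] + pseudocount) / total for a in ALPHABET})
    return profile


def column_score(p, q):
    """Dot product of two frequency columns (one simple choice among many)."""
    return sum(p[a] * q[a] for a in ALPHABET)


def align_profiles(prof_x, prof_y, gap=-0.05):
    """Global alignment of two profiles with a linear gap penalty.

    Returns the optimal score and the list of matched column index pairs.
    """
    n, m = len(prof_x), len(prof_y)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + column_score(prof_x[i - 1], prof_y[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover matched columns.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        diag = score[i - 1][j - 1] + column_score(prof_x[i - 1], prof_y[j - 1])
        if score[i][j] == diag:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    # Toy hit sets standing in for database search results (made-up data).
    prof_x = build_profile(["ACDE", "ACDE", "ACDF"])
    prof_y = build_profile(["ACWDE"])
    total, pairs = align_profiles(prof_x, prof_y)
    print(pairs)
```

In this sketch the hits only contribute column frequencies. Linking distant homologs through intermediate sequences and adding secondary structure predictions, as the abstract describes, would need further scoring terms that are not shown here.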