Faster algorithms for optimal multiple sequence alignment based on pairwise comparisons

Authors:
Pankaj K. Agarwal;Yonatan Bilu;Rachel Kolodny
Affiliations:
Department of Computer Science, Duke University, Durham, NC;Department of Molecular Genetics, Weizmann Institute, Rehovot, Israel;Department of Biochemistry and Molecular Biophysics, Columbia University
Venue:
WABI'05 Proceedings of the 5th International conference on Algorithms in Bioinformatics
Year:
2005

Abstract

Multiple Sequence Alignment (MSA) is one of the most fundamental problems in computational molecular biology. The running time of the best known scheme for finding an optimal alignment, based on dynamic programming, increases exponentially with the number of input sequences. Hence, many heuristics have been suggested for the problem. We consider the following version of the MSA problem: in a preprocessing stage, pairwise alignments are found for every pair of sequences. The goal is to find an optimal alignment in which matches are restricted to positions that were matched at the preprocessing stage. We present several techniques for making the dynamic programming algorithm more efficient, while still finding an optimal solution under these restrictions. Namely, in our formulation the MSA must conform to the pairwise (local) alignments, and in return it can be solved more efficiently. We prove that it suffices to find an optimal alignment of sequence segments, rather than single letters, thereby reducing the input size and thus improving the running time.
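
The preprocessing stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which pairwise alignment method is used, so a longest-common-subsequence alignment stands in for it here, and the function names `lcs_matches` and `allowed_matches` are hypothetical. The sketch only shows how the set of permitted matches could be collected for every pair of sequences; a restricted MSA would then be allowed to align two positions only if they appear in this set.

```python
from itertools import combinations


def lcs_matches(a, b):
    """Return matched index pairs (i, j) of one longest common subsequence.

    Assumption: LCS is a stand-in for whatever pairwise alignment
    the preprocessing stage actually uses.
    """
    n, m = len(a), len(b)
    # dp[i][j] = LCS length of the suffixes a[i:] and b[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    # Walk forward through the table to recover the matched positions.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def allowed_matches(seqs):
    """Map each sequence pair (p, q), p < q, to its set of permitted matches."""
    allowed = {}
    for p, q in combinations(range(len(seqs)), 2):
        allowed[(p, q)] = set(lcs_matches(seqs[p], seqs[q]))
    return allowed


if __name__ == "__main__":
    sequences = ["ACGT", "AGT", "ACT"]  # toy input, not from the paper
    for pair, matches in allowed_matches(sequences).items():
        print(pair, sorted(matches))
```

With k sequences this stage performs k(k-1)/2 pairwise alignments, each quadratic in the sequence lengths for this stand-in, so the exponential cost is confined to the subsequent multiple-alignment step.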