A memory-efficient algorithm for multiple sequence alignment with constraints

Authors:
Chin Lung Lu; Yen Pin Huang
Affiliations:
Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan, Republic of China; Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan, Republic of China
Venue:
Bioinformatics
Year:
2005

Abstract

Motivation: Constrained sequence alignment was recently proposed as a way to incorporate biologists' knowledge about the structures, functionalities, or consensuses of their datasets into sequence alignment, so that user-specified residues/nucleotides are aligned together in the computed alignment. The currently developed programs use the so-called progressive approach to efficiently obtain a constrained alignment of several sequences. However, the kernels of these programs, the dynamic programming algorithms for computing an optimal constrained alignment between two sequences, require O(γn²) memory, where γ is the number of constraints and n is the maximum sequence length. This high memory requirement limits these programs to aligning short sequences only.

Results: We adopt the divide-and-conquer approach to design a memory-efficient algorithm for computing an optimal constrained alignment between two sequences. It greatly reduces the memory requirement of the dynamic programming approaches at the expense of a small constant factor in CPU time. The new algorithm consumes only O(αn) space, where α is the sum of the lengths of the constraints and usually α ≪ n in practical applications. Based on this algorithm, we have developed a memory-efficient tool for multiple sequence alignment with constraints.

Availability: http://genome.life.nctu.edu.tw/MUSICME

Contact: cllu@mail.nctu.edu.tw
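
To illustrate the kind of divide-and-conquer strategy the abstract refers to, the following is a minimal sketch of the classic linear-space technique (Hirschberg's algorithm) for ordinary, unconstrained pairwise global alignment. It is not the authors' constrained algorithm and is not taken from their tool. The scoring values and the names `sub`, `last_row`, `align` and `score` are assumptions chosen for illustration. The sketch shows only how an optimal alignment can be recovered while keeping two dynamic programming rows in memory at a time instead of the full table.

```python
# Assumed scoring scheme for illustration only (not from the paper).
MATCH, MISMATCH, GAP = 1, -1, -2


def sub(x, y):
    """Substitution score for aligning characters x and y."""
    return MATCH if x == y else MISMATCH


def last_row(a, b):
    """Optimal scores of aligning all of `a` with each prefix of `b`.

    Only two rows of the dynamic programming table are kept,
    so the memory use is O(len(b)).
    """
    prev = [j * GAP for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * GAP] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = max(prev[j - 1] + sub(a[i - 1], b[j - 1]),  # (mis)match
                         prev[j] + GAP,                          # gap in b
                         cur[j - 1] + GAP)                       # gap in a
        prev = cur
    return prev


def align(a, b):
    """Return an optimal global alignment of a and b as two gapped strings."""
    if len(a) == 0:
        return "-" * len(b), b
    if len(b) == 0:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: pair the single character with its best partner in b,
        # unless leaving it unpaired scores higher.
        j = max(range(len(b)), key=lambda k: sub(a[0], b[k]))
        if sub(a[0], b[j]) >= 2 * GAP:
            return "-" * j + a + "-" * (len(b) - j - 1), b
        return a + "-" * len(b), "-" + b
    # Divide: split a in half and find where the optimal path crosses b.
    mid = len(a) // 2
    left = last_row(a[:mid], b)
    right = last_row(a[mid:][::-1], b[::-1])
    split = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])
    # Conquer: align the two halves independently and concatenate.
    l1, l2 = align(a[:mid], b[:split])
    r1, r2 = align(a[mid:], b[split:])
    return l1 + r1, l2 + r2


def score(x, y):
    """Score of a gapped alignment given as two equal-length strings."""
    return sum(GAP if "-" in (p, q) else sub(p, q) for p, q in zip(x, y))


if __name__ == "__main__":
    top, bottom = align("AGTACGCA", "TATGC")
    print(top)
    print(bottom)
    print(score(top, bottom))
```

In this sketch the forward and backward score rows are recomputed at each level of the recursion, which roughly doubles the running time in exchange for linear memory. This is the same kind of trade-off the abstract describes as a small constant factor in CPU time.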