Parametric optimization of sequence alignment

Authors:
D. Gusfield; K. Balasubramanian; D. Naor
Venue:
SODA '92 Proceedings of the third annual ACM-SIAM symposium on Discrete algorithms
Year:
1992

Citing (3):
Mathematical Techniques for Efficient Record Segmentation in Large Shared Databases. Journal of the ACM (JACM)
Parametric Combinatorial Computing and a Problem of Program Module Distribution. Journal of the ACM (JACM)
Sequence Analysis in Molecular Biology: Treasure Trove or Trivial Pursuit

Cited by (6):
New flexible approaches for multiple sequence alignment. RECOMB '97 Proceedings of the first annual international conference on Computational molecular biology
Fast and numerically stable parametric alignment of biosequences. RECOMB '97 Proceedings of the first annual international conference on Computational molecular biology
Optimal detection of sequence similarity by local alignment. RECOMB '98 Proceedings of the second annual international conference on Computational molecular biology
A simple iterative approach to parameter optimization. RECOMB '00 Proceedings of the fourth annual international conference on Computational molecular biology
Parametric Sequence Alignment with Constraints. Constraints
Pareto Optimal Pairwise Sequence Alignment. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Abstract

The optimal alignment, or the weighted minimum edit distance, between two DNA or amino acid sequences for a given set of weights is computed by classical dynamic programming techniques and is widely used in molecular biology. However, there is considerable disagreement about how to weight matches, mismatches, insertions/deletions (indels) and gaps in DNA and amino acid sequences. Parametric sequence alignment is the problem of computing the optimal-valued alignment between two sequences as a function of variable weights for matches, mismatches, spaces and gaps. The goal is to partition the parameter space into regions (which are necessarily convex) such that in each region one alignment is optimal throughout, and such that the regions are maximal for this property. In this paper we are primarily concerned with the structure of this convex decomposition, and secondarily with the complexity of computing the decomposition. The most striking results are the following. For the special case where only matches, mismatches and spaces are counted, and where spaces are counted throughout the alignment, we show that the decomposition is surprisingly simple: all regions are infinite; there are at most n^{2/3} regions; the lines that bound the regions are all of the form β = c + (c + 0.5)α; and the entire decomposition can be found in O(knm) time, where k is the actual number of regions and n and m are the lengths of the two strings. These results were found while implementing a large software package to do parametric sequence analysis, and in turn have led to faster algorithms for those tasks.
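
To make the special case in the abstract concrete, here is a minimal sketch of the fixed-weight dynamic program that a parametric method would evaluate at individual points (α, β). The abstract does not state the objective function, so the score used here (matches minus α times mismatches minus β times spaces, with spaces counted throughout the alignment) is an assumption, and the function name `align_counts` is hypothetical. It is not the authors' decomposition algorithm, only the classical O(nm) evaluation at a single parameter point.

```python
def align_counts(s, t, alpha, beta):
    """Global alignment of s and t at one parameter point.

    Assumed score (not given in the abstract):
        value = matches - alpha * mismatches - beta * spaces
    Returns (value, matches, mismatches, spaces) for one optimal alignment.
    """
    m = len(t)
    # prev[j]: best (value, matches, mismatches, spaces) for "" versus t[:j],
    # which can only be j spaces.
    prev = [(-beta * j, 0, 0, j) for j in range(m + 1)]
    for i in range(1, len(s) + 1):
        # s[:i] versus "": i spaces.
        cur = [(-beta * i, 0, 0, i)]
        for j in range(1, m + 1):
            v, mt, ms, sp = prev[j - 1]
            if s[i - 1] == t[j - 1]:
                diag = (v + 1, mt + 1, ms, sp)       # match
            else:
                diag = (v - alpha, mt, ms + 1, sp)   # mismatch
            v, mt, ms, sp = prev[j]
            up = (v - beta, mt, ms, sp + 1)          # s[i-1] against a space
            v, mt, ms, sp = cur[j - 1]
            left = (v - beta, mt, ms, sp + 1)        # t[j-1] against a space
            # Tuples compare by value first, so max() keeps an optimal cell.
            cur.append(max(diag, up, left))
        prev = cur
    return prev[m]


if __name__ == "__main__":
    # Two parameter points on opposite sides of a region boundary.
    print(align_counts("ACGT", "CGTA", 1.0, 1.0))   # (1.0, 3, 0, 2)
    print(align_counts("ACGT", "CGTA", 0.25, 4.0))  # (-1.0, 0, 4, 0)
```

For this toy pair, the only competitive alignments have counts (3 matches, 0 mismatches, 2 spaces) and (0 matches, 4 mismatches, 0 spaces), so under the assumed score the optimal value is max(3 − 2β, −4α) and the two regions meet on the line β = 1.5 + 2α. That matches the form β = c + (c + 0.5)α quoted in the abstract with c = 1.5, and both regions are unbounded.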