A new approach to sequence comparison: normalized sequence alignment

Authors:
Abdullah N. Arslan;Ömer Eğecioğlu;Pavel A. Pevzner
Affiliations:
Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA;Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA;Department of Computer Science and Engineering, University of California, San Diego, San Diego, CA
Venue:
RECOMB '01 Proceedings of the fifth annual international conference on Computational biology
Year:
2001

Citing 10

The Normalized String Editing Problem Revisited
IEEE Transactions on Pattern Analysis and Machine Intelligence

A dictionary based approach for gene annotation
RECOMB '99 Proceedings of the third annual international conference on Computational molecular biology

Human and mouse gene structure: comparative analysis and application to exon prediction
RECOMB '00 Proceedings of the fourth annual international conference on Computational molecular biology

Dynamic Programming
Dynamic Programming

Computation of Normalized Edit Distance and Applications
IEEE Transactions on Pattern Analysis and Machine Intelligence

Fast Computation of Normalized Edit Distances
IEEE Transactions on Pattern Analysis and Machine Intelligence

Parametric Recomputing in Alignment Graphs
CPM '94 Proceedings of the 5th Annual Symposium on Combinatorial Pattern Matching

The Conserved Exon Method for Gene Finding
Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology

Parallel Algorithms for Fast Computation of Normalized Edit Distances
SPDP '96 Proceedings of the 8th IEEE Symposium on Parallel and Distributed Processing (SPDP '96)

An Efficient Uniform-Cost Normalized Edit Distance Algorithm
SPIRE '99 Proceedings of the String Processing and Information Retrieval Symposium & International Workshop on Groupware

Cited 5

Picking alignments from (Steiner) trees
Proceedings of the sixth annual international conference on Computational biology

Learning Significant Alignments: An Alternative to Normalized Local Alignment
ISMIS '02 Proceedings of the 13th International Symposium on Foundations of Intelligent Systems

Statistical Identification of Uniformly Mutated Segments within Repeats
CPM '02 Proceedings of the 13th Annual Symposium on Combinatorial Pattern Matching

Simple and Practical Sequence Nearest Neighbors with Block Operations
CPM '02 Proceedings of the 13th Annual Symposium on Combinatorial Pattern Matching

RNA secondary structure prediction via energy density minimization
RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology

Abstract

The Smith-Waterman algorithm for local sequence alignment is one of the most important techniques in computational molecular biology. This ingenious dynamic programming approach was designed to reveal highly conserved fragments by discarding poorly conserved initial and terminal segments. However, the existing notion of local similarity has a serious flaw: it does not discard poorly conserved intermediate segments. The Smith-Waterman algorithm finds the local alignment with the maximal score, but it cannot find the local alignment with the maximum degree of similarity (e.g., the maximal percentage of matches). Moreover, there is still no efficient algorithm that answers the following natural question: do two sequences share a (sufficiently long) fragment with more than 70% similarity? As a result, local alignment sometimes produces a mosaic of well-conserved fragments artificially connected by poorly conserved or even unrelated fragments. This may lead to problems in the comparison of long genomic sequences and in comparative gene prediction, as recently pointed out by Zhang et al., 1999 [33]. In this paper we propose a new sequence comparison algorithm (normalized local alignment) that reports the regions with the maximum degree of similarity. The algorithm is based on fractional programming and its running time is O(n^2 log n). In practice, normalized local alignment is only 3-5 times slower than the standard Smith-Waterman algorithm.
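
To make the idea of fractional programming over alignments concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that the abstract does not spell out: a match scores 1, a mismatch scores -delta, and an indel scores -mu; the "length" of an alignment is the sum of the lengths of the two aligned substrings; and the quantity maximized is score / (length + L) for a user-chosen constant L > 0 that penalizes very short alignments. The sketch uses Dinkelbach's iteration, a standard fractional programming technique, with a Smith-Waterman style dynamic program as the inner step. The function names `parametric_local` and `normalized_local_alignment` are hypothetical, and the sketch makes no attempt to reproduce the running-time bound stated above.

```python
from fractions import Fraction


def parametric_local(a, b, lam, delta, mu):
    """Smith-Waterman style DP with weights shifted by lam (linear gap cost).

    Returns (x, y, z): the numbers of matches, mismatches and indels of one
    local alignment maximizing
        (1 - 2*lam)*x - (delta + 2*lam)*y - (mu + lam)*z,
    which equals score - lam * length under the assumed definitions.
    """
    w_match = 1 - 2 * lam
    w_mismatch = -(delta + 2 * lam)
    w_indel = -(mu + lam)
    empty = (0, 0, 0, 0)  # (value, matches, mismatches, indels)
    prev = [empty] * (len(b) + 1)
    best = empty
    for i in range(1, len(a) + 1):
        cur = [empty]
        for j in range(1, len(b) + 1):
            v, x, y, z = prev[j - 1]
            if a[i - 1] == b[j - 1]:
                diag = (v + w_match, x + 1, y, z)
            else:
                diag = (v + w_mismatch, x, y + 1, z)
            v, x, y, z = prev[j]
            up = (v + w_indel, x, y, z + 1)
            v, x, y, z = cur[j - 1]
            left = (v + w_indel, x, y, z + 1)
            # The empty alignment (value 0) lets a local alignment restart here.
            cell = max(empty, diag, up, left, key=lambda t: t[0])
            cur.append(cell)
            if cell[0] > best[0]:
                best = cell
        prev = cur
    return best[1:]


def normalized_local_alignment(a, b, L, delta=1, mu=1):
    """Dinkelbach iteration for max score / (length + L), with L > 0.

    Returns (lam, (x, y, z)): the best normalized score as an exact Fraction
    and the operation counts of an alignment that attains it.
    """
    lam = Fraction(0)
    while True:
        x, y, z = parametric_local(a, b, lam, delta, mu)
        score = x - delta * y - mu * z
        length = 2 * x + 2 * y + z  # characters consumed from a plus from b
        new_lam = Fraction(score, length + L)
        if new_lam == lam:  # no alignment has a better ratio: converged
            return lam, (x, y, z)
        lam = new_lam
```

As a toy illustration, take a = "A"*10 + "C"*5 + "A"*10 and b = "A"*10 + "G"*5 + "A"*10. With lam = 0 the inner step is ordinary local alignment and returns the mosaic (20 matches joined by 5 mismatches, score 15). With L = 10 the normalized version instead settles on a single block of 10 matches, with normalized score 10/30 = 1/3 against 15/60 = 1/4 for the mosaic. Exact rational arithmetic is used so that the convergence test is an equality check rather than a floating-point tolerance.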