A Fast Algorithm on Average for All-Against-All Sequence Matching

Authors:
Ricardo A. Baeza-Yates; Gaston H. Gonnet
Venue:
SPIRE '99 Proceedings of the String Processing and Information Retrieval Symposium & International Workshop on Groupware
Year:
1999

Cited by 14:

A syntactic approach for searching similarities within sentences
Proceedings of the eleventh international conference on Information and knowledge management

Matchsimile: a flexible approximate matching tool for searching proper names
Journal of the American Society for Information Science and Technology

Approximate String Joins in a Database (Almost) for Free
Proceedings of the 27th International Conference on Very Large Data Bases

A Multiresolution Symbolic Representation of Time Series
ICDE '05 Proceedings of the 21st International Conference on Data Engineering

EXTRA: a system for example-based translation assistance
Machine Translation

A dimensionality reduction technique for efficient time series similarity analysis
Information Systems

A graph approach to the threshold all-against-all substring matching problem
Journal of Experimental Algorithmics (JEA)

An efficient pattern matching algorithm for comparative Genome sequence analysis
ACC'08 Proceedings of the WSEAS International Conference on Applied Computing Conference

Comparative genome sequence analysis by efficient pattern matching technique
WSEAS Transactions on Information Science and Applications

Indexing methods for approximate dictionary searching: Comparative analysis
Journal of Experimental Algorithmics (JEA)

Compressed directed acyclic word graph with application in local alignment
COCOON'11 Proceedings of the 17th annual international conference on Computing and combinatorics

A new algorithm for fast all-against-all substring matching
SPIRE'06 Proceedings of the 13th international conference on String Processing and Information Retrieval

Scalable string similarity search/join with approximate seeds and multiple backtracking
Proceedings of the Joint EDBT/ICDT 2013 Workshops

Efficient fuzzy search in large text collections
ACM Transactions on Information Systems (TOIS)

Abstract

In this paper we present an algorithm that attempts to align pairs of subsequences from a database of genetic sequences. The algorithm simulates the classical dynamic programming alignment algorithm over a suffix array of the database. We provide a detailed average-case analysis showing that the running time of the algorithm is sub-quadratic in the database size. A similar algorithm solves the approximate string matching problem in sub-linear average time.
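
As a rough illustration of what "simulating dynamic programming over a suffix array" can look like, here is a minimal Python sketch for the approximate string matching variant mentioned in the last sentence. It is not the authors' algorithm and carries none of its average-case guarantees. The function names (build_suffix_array, next_row, approx_match_starts), the unit-cost edit distance, and the naive suffix array construction are all assumptions made for the example. The idea shown is only that suffixes adjacent in sorted order share prefixes, so the dynamic programming rows computed for a shared prefix can be reused, and a suffix can be abandoned once every entry of its current row exceeds the error threshold k.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; fine for a small demo,
    # far too slow for a real sequence database.
    return sorted(range(len(text)), key=lambda i: text[i:])


def next_row(prev_row, pattern, ch):
    # One step of the classical edit-distance recurrence (unit costs assumed).
    # prev_row[j] is the distance between pattern[:j] and the text read so far.
    row = [prev_row[0] + 1]
    for j in range(1, len(pattern) + 1):
        cost = 0 if pattern[j - 1] == ch else 1
        row.append(min(prev_row[j] + 1,          # skip a text character
                       row[j - 1] + 1,           # skip a pattern character
                       prev_row[j - 1] + cost))  # match or substitute
    return row


def approx_match_starts(text, pattern, k):
    """Return the start positions i such that some prefix of text[i:]
    is within edit distance k of pattern."""
    m = len(pattern)
    rows = [list(range(m + 1))]  # rows[d] = DP row after reading d characters
    prev = ""
    hits = []
    for start in build_suffix_array(text):
        suffix = text[start:]
        # Length of the prefix shared with the previous suffix in sorted order.
        lcp = 0
        while lcp < min(len(prev), len(suffix)) and prev[lcp] == suffix[lcp]:
            lcp += 1
        # Keep only the rows that belong to the shared prefix.
        del rows[min(lcp, len(rows) - 1) + 1:]
        # Extend until the suffix ends or no cell can still reach distance <= k
        # (the minimum of a row never decreases, so stopping here is safe).
        while len(rows) - 1 < len(suffix) and min(rows[-1]) <= k:
            rows.append(next_row(rows[-1], pattern, suffix[len(rows) - 1]))
        if any(r[m] <= k for r in rows):
            hits.append(start)
        prev = suffix
    return sorted(hits)
```

For example, approx_match_starts("banana", "ana", 0) returns [1, 3], the two exact occurrences. A practical implementation would replace the sorted-suffix construction and the character-by-character prefix comparison with a proper suffix array and LCP table.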