Average complexity of exact and approximate multiple string matching

Authors:
Gonzalo Navarro;Kimmo Fredriksson
Affiliations:
Department of Computer Science, University of Chile, Santiago, Chile;Department of Computer Science, University of Helsinki, Finland and Department of Computer Science, University of Joensuu, Finland
Venue:
Theoretical Computer Science
Year:
2004

References (5):
An improved algorithm for approximate string matching. SIAM Journal on Computing
Text algorithms
Efficient string matching: an aid to bibliographic search. Communications of the ACM
Approximate String Matching and Local Similarity. CPM '94 Proceedings of the 5th Annual Symposium on Combinatorial Pattern Matching
Average-optimal multiple approximate string matching. CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching

Cited by (11):
Average-optimal single and multiple approximate string matching. Journal of Experimental Algorithmics (JEA)
Sequential and indexed two-dimensional combinatorial template matching allowing rotations. Theoretical Computer Science
Succinct backward-DAWG-matching. Journal of Experimental Algorithmics (JEA)
A no-word-segmentation hierarchical clustering approach to Chinese web search results. AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
MPSCAN: fast localisation of multiple reads in genomes. WABI'09 Proceedings of the 9th international conference on Algorithms in bioinformatics
On the bit-parallel simulation of the nondeterministic Aho-Corasick and suffix automata for a set of patterns. Journal of Discrete Algorithms
A partition-based efficient algorithm for large scale multiple-strings matching. SPIRE'05 Proceedings of the 12th international conference on String Processing and Information Retrieval
Average complexity of backward q-gram string matching algorithms. Information Processing Letters
Fast multiple string matching using streaming SIMD extensions technology. SPIRE'12 Proceedings of the 19th international conference on String Processing and Information Retrieval
Approximate regional sequence matching for genomic databases. The VLDB Journal — The International Journal on Very Large Data Bases
Exact online two-dimensional pattern matching using multiple pattern matching algorithms. Journal of Experimental Algorithmics (JEA)

Abstract

We show that the average number of characters examined to search for r random patterns of length m in a text of length n over a uniformly distributed alphabet of size σ is Ω(n log_σ(rm)/m). When up to k insertions, deletions, and/or substitutions of characters are permitted in the occurrences of the patterns, the lower bound becomes Ω(n(k + log_σ(rm))/m). These results generalize the previous single-pattern lower bounds of Yao (for exact matching) and of Chang and Marr (for approximate matching), and prove the optimality of several existing multipattern search algorithms.
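
To get a feel for how the two bounds scale, the following minimal Python sketch evaluates the expressions inside the Ω(·) with all constant factors dropped. The function names `exact_bound` and `approx_bound` and the sample parameter values are illustrative assumptions, not taken from the paper, and the returned numbers indicate growth rates only, not actual character counts.

```python
import math


def exact_bound(n, m, r, sigma):
    """Growth term n * log_sigma(r*m) / m of the exact-matching bound.

    Constant factors hidden by the Omega notation are dropped (assumption).
    """
    return n * math.log(r * m, sigma) / m


def approx_bound(n, m, r, sigma, k):
    """Growth term n * (k + log_sigma(r*m)) / m when up to k errors are allowed."""
    return n * (k + math.log(r * m, sigma)) / m


if __name__ == "__main__":
    # Hypothetical parameters: DNA-sized alphabet, short patterns.
    n, m, sigma, k = 1_000_000, 16, 4, 2
    for r in (1, 16, 256):
        print(r, exact_bound(n, m, r, sigma), approx_bound(n, m, r, sigma, k))
```

With r = 1 the two expressions reduce to the single-pattern forms n log_σ(m)/m and n(k + log_σ(m))/m, and with k = 0 the approximate term coincides with the exact one.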