Optimal Exact String Matching Based on Suffix Arrays

Authors:
Mohamed Ibrahim Abouelhoda; Enno Ohlebusch; Stefan Kurtz
Venue:
SPIRE 2002 Proceedings of the 9th International Symposium on String Processing and Information Retrieval
Year:
2002

References (11):
New indices for text: PAT Trees and PAT arrays. Information retrieval.
Suffix arrays: a new method for on-line string searches. SIAM Journal on Computing.
Algorithms on strings, trees, and sequences: computer science and computational biology.
Fast algorithms for sorting and searching strings. SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms.
Compressed suffix arrays and suffix trees with applications to text indexing and string matching (extended abstract). STOC '00 Proceedings of the thirty-second annual ACM symposium on Theory of computing.
Reducing the space requirement of suffix trees. Software: Practice & Experience.
An experimental study of an opportunistic index. SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms.
The Enhanced Suffix Array and Its Applications to Genome Analysis. WABI '02 Proceedings of the Second International Workshop on Algorithms in Bioinformatics.
Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications. CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching.
Opportunistic data structures with applications. FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science.
Linear pattern matching algorithms. SWAT '73 Proceedings of the 14th Annual Symposium on Switching and Automata Theory.

Cited by (25):
The Enhanced Suffix Array and Its Applications to Genome Analysis. WABI '02 Proceedings of the Second International Workshop on Algorithms in Bioinformatics.
When indexing equals compression: experiments with compressing suffix arrays and applications. SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms.
Replacing suffix trees with enhanced suffix arrays. Journal of Discrete Algorithms - SPIRE 2002.
Indexing text using the Ziv-Lempel trie. Journal of Discrete Algorithms - SPIRE 2002.
Detecting higher-level similarity patterns in programs. Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering.
Matching statistics: efficient computation and a new practical algorithm for the multiple common substring problem. Software: Practice & Experience.
Succinct suffix arrays based on run-length encoding. Nordic Journal of Computing.
Linear work suffix array construction. Journal of the ACM (JACM).
Text indexing with errors. Journal of Discrete Algorithms.
Fast blocking of undesirable web pages on client PC by discriminating URL using neural networks. Expert Systems with Applications: An International Journal.
Faster index for property matching. Information Processing Letters.
Improving suffix array locality for fast pattern matching on disk. Proceedings of the 2008 ACM SIGMOD international conference on Management of data.
Implementing the LZ-index: Theory versus practice. Journal of Experimental Algorithmics (JEA).
Using Bloom Filters for Large Scale Gene Sequence Analysis in Haskell. PADL '09 Proceedings of the 11th International Symposium on Practical Aspects of Declarative Languages.
Errata for “Faster index for property matching”. Information Processing Letters.
Space efficient linear time construction of suffix arrays. CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching.
Simple linear work suffix array construction. ICALP'03 Proceedings of the 30th international conference on Automata, languages and programming.
Indexing circular patterns. WALCOM'08 Proceedings of the 2nd international conference on Algorithms and computation.
A new efficient indexing algorithm for one-dimensional real scaled patterns. Journal of Computer and System Sciences.
Inverted files versus suffix arrays for locating patterns in primary memory. SPIRE'06 Proceedings of the 13th international conference on String Processing and Information Retrieval.
Succinct text indexes on large alphabet. TAMC'06 Proceedings of the Third international conference on Theory and Applications of Models of Computation.
Space-efficient construction of LZ-index. ISAAC'05 Proceedings of the 16th international conference on Algorithms and Computation.
Text indexing with errors. CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching.
A new compressed suffix tree supporting fast search and its construction algorithm using optimal working space. CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching.
Time and space efficient search for small alphabets with suffix arrays. FSKD'05 Proceedings of the Second international conference on Fuzzy Systems and Knowledge Discovery - Volume Part I.


Abstract

Using the suffix tree of a string S, decision queries of the type "Is P a substring of S?" can be answered in O(|P|) time, and enumeration queries of the type "Where are all z occurrences of P in S?" can be answered in O(|P| + z) time, independent of the size of S. However, in large-scale applications such as genome analysis, the space requirements of the suffix tree are a severe drawback. The suffix array is a more space-economical index structure. Using it and an additional table, Manber and Myers (1993) showed that decision queries and enumeration queries can be answered in O(|P| + log |S|) and O(|P| + log |S| + z) time, respectively, but no optimal-time algorithms are known. In this paper, we show how to achieve the optimal O(|P|) and O(|P| + z) time bounds for the suffix array. Our approach is not confined to exact pattern matching: it can be used to efficiently solve all problems that are usually solved by a top-down traversal of the suffix tree. Experiments show that our method is not only of theoretical interest but also of practical relevance.
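For readers unfamiliar with the two query types, the following minimal Python sketch answers decision and enumeration queries on a plain suffix array using binary search. It is only the textbook baseline, running in roughly O(|P| log |S|) time per query; it is neither the Manber and Myers method with its additional table nor the optimal algorithm of this paper. The function names `build_suffix_array`, `find_occurrences`, and `is_substring` are hypothetical, and the naive construction by sorting is chosen for brevity, not efficiency.

```python
from typing import List


def build_suffix_array(s: str) -> List[int]:
    """Naive construction (an assumption for brevity): sort the suffix
    start positions by the suffixes they denote."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def find_occurrences(s: str, sa: List[int], p: str) -> List[int]:
    """Enumeration query: all start positions of p in s, in increasing order.

    Suffixes that start with p form one contiguous interval of sa, so two
    binary searches over length-|p| prefixes locate its boundaries.
    """
    m = len(p)
    # Left boundary: first suffix whose length-m prefix is >= p.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] < p:
            lo = mid + 1
        else:
            hi = mid
    left = lo
    # Right boundary: first suffix whose length-m prefix is > p.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] <= p:
            lo = mid + 1
        else:
            hi = mid
    # Positions inside the interval are in suffix order; sort for readability.
    return sorted(sa[left:lo])


def is_substring(s: str, sa: List[int], p: str) -> bool:
    """Decision query: does p occur in s at least once?"""
    return len(find_occurrences(s, sa, p)) > 0


text = "banana"
sa = build_suffix_array(text)
print(sa)
print(find_occurrences(text, sa, "ana"))
print(is_substring(text, sa, "nab"))
```

Each comparison in the binary search inspects up to |P| characters, which is where the log |S| factor multiplying |P| comes from in this baseline.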