In this paper, we explore worst-case solutions for the problems of single-pattern and multi-pattern matching on strings in the RAM model with word length w. In the first problem, we are given a pattern p of length m over an alphabet of size σ, and for any text T of length n, where each character is encoded using log σ bits, we wish to find all occurrences of p in T. In the multi-pattern matching problem, we are given a set S of d patterns of total length m, and a query on a text T consists of finding all occurrences in T of the patterns in S (in the following, occ denotes the number of reported occurrences). Since each character of the text is encoded using log σ bits and w bits can be read in constant time in the RAM model, the best possible query time for the two problems, which can only be achieved by reading Θ(w/log σ) consecutive characters at a time, is O(n log σ/w + occ). We present two results. The first is that, using O(m) words of space, single-pattern matching queries can be answered in time O(n((log m)/m + (log σ)/w) + occ), and multi-pattern matching queries can be answered in time O(n((log d + log y + log log m)/y + (log σ)/w) + occ), where y is the length of the shortest pattern. The second result is a variant of the first that uses the four Russians technique to remove the dependence on the shortest pattern length at the expense of additional space t. It answers multi-pattern matching queries in time O(n(log d + log log_σ t + log log m)/(log_σ t) + occ) using O(m + t) words of space.
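The O(n log σ/w) term reflects the cost of merely reading a text whose characters are packed into machine words. The following Python sketch illustrates that packing only; it is not the matching algorithm of the paper. The function names (`bits_per_char`, `pack_text`, `words_to_read`), the default w = 64, and the simplification that no character straddles a word boundary are assumptions made for illustration.

```python
def bits_per_char(sigma: int) -> int:
    """Bits needed per character for an alphabet of size sigma (ceil(log2 sigma), at least 1)."""
    return max(1, (sigma - 1).bit_length())


def pack_text(codes, sigma: int, w: int = 64):
    """Pack integer character codes (each in [0, sigma)) into w-bit words.

    Assumption: each word holds floor(w / bits_per_char) characters,
    so no character straddles a word boundary.
    """
    b = bits_per_char(sigma)
    per_word = w // b
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for offset, c in enumerate(codes[start:start + per_word]):
            word |= c << (offset * b)  # place character at its b-bit slot
        words.append(word)
    return words


def unpack_word(word: int, sigma: int, count: int):
    """Recover the first `count` character codes stored in one packed word."""
    b = bits_per_char(sigma)
    mask = (1 << b) - 1
    return [(word >> (i * b)) & mask for i in range(count)]


def words_to_read(n: int, sigma: int, w: int = 64) -> int:
    """Number of w-bit words occupied by a text of n characters under this packing."""
    per_word = w // bits_per_char(sigma)
    return -(-n // per_word)  # ceiling division


# A DNA-like text (sigma = 4, 2 bits per character) of n = 100 characters
# fits 32 characters per 64-bit word.
dna_codes = [i % 4 for i in range(100)]
packed = pack_text(dna_codes, sigma=4)
```

Under these assumptions a 100-character text over a 4-letter alphabet occupies ceil(100/32) words, so any algorithm that inspects the whole text must perform at least that many word reads, which is the n log σ/w term in the bounds above.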