In pattern matching with character classes, the goal is to find all occurrences of a pattern of length $m$ in a text of length $n$, where each pattern position specifies a set of allowed characters from a finite alphabet $\Sigma$. We present an FFT-based algorithm that uses a novel prime-number encoding scheme and is $\log n/\log m$ times faster than the fastest existing approaches, which are based on boolean convolutions. In particular, if $m^{|\Sigma|} = n^{O(1)}$, our algorithm runs in $O(n \log m)$ time, matching the complexity of the fastest techniques for wildcard matching, a special case of our problem. A major advantage of our algorithm is that it allows a tradeoff between the running time and the RAM word size. Our algorithm also speeds up solutions to approximate matching problems with character classes, namely matching with $k$ mismatches and Hamming distance, as well as to the subset matching problem.
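To make the problem statement concrete, the following is a minimal sketch of the baseline the abstract compares against: one boolean convolution per alphabet symbol. It is not the prime-number encoding algorithm, whose details are not given here. The function names (`fft`, `convolve`, `match_character_classes`) and the representation of the pattern as a list of Python sets are assumptions made for illustration only.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    With invert=True the result is the unnormalised inverse transform.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def convolve(x, y):
    """Linear convolution of two 0/1 lists, computed via FFT."""
    size = 1
    while size < len(x) + len(y) - 1:
        size *= 2
    fx = fft(list(x) + [0] * (size - len(x)))
    fy = fft(list(y) + [0] * (size - len(y)))
    inv = fft([p * q for p, q in zip(fx, fy)], invert=True)
    # Normalise the inverse transform and round back to integers.
    return [round(v.real / size) for v in inv[: len(x) + len(y) - 1]]


def match_character_classes(text, pattern):
    """Return every start index i such that text[i + j] is in pattern[j]
    for all j. `pattern` is a list of sets (hypothetical representation)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    hits = [0] * (n - m + 1)
    # Symbols absent from the text contribute nothing, so iterate over
    # the characters that actually occur.
    for c in set(text):
        t_ind = [1 if ch == c else 0 for ch in text]
        # Reversing the pattern indicator turns convolution into correlation.
        p_ind = [1 if c in cls else 0 for cls in reversed(pattern)]
        conv = convolve(t_ind, p_ind)
        for i in range(n - m + 1):
            hits[i] += conv[i + m - 1]
    # An alignment matches when all m positions agree.
    return [i for i, h in enumerate(hits) if h == m]


print(match_character_classes("abcabd", [{"a"}, {"b"}, {"c", "d"}]))  # [0, 3]
```

As written, the sketch performs one FFT-based convolution over the whole text for each distinct character, so its cost grows linearly with the alphabet size. That per-symbol factor is the overhead that an encoding packing several symbols into one convolution would aim to reduce.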