Online dictionary matching with variable-length gaps

Authors:
Tuukka Haapasalo;Panu Silvasti;Seppo Sippu;Eljas Soisalon-Soininen
Affiliations:
Aalto University School of Science;Aalto University School of Science;University of Helsinki;Aalto University School of Science
Venue:
SEA'11 Proceedings of the 10th international conference on Experimental algorithms
Year:
2011

Abstract

The string-matching problem with wildcards is considered in the context of online matching of multiple patterns. Each pattern is a sequence of characters from the input alphabet and variable-length gaps, where the width of a gap may vary between two integer bounds or from an integer lower bound to infinity. Our algorithm is based on locating the "keywords" of the patterns in the input text, that is, the maximal substrings of the patterns that contain only input characters. Matches of pattern prefixes are assembled from the keyword matches, and a match is reported when a prefix that constitutes a complete pattern is found. While collecting these partial matches, we avoid locating keyword occurrences that cannot extend any pattern prefix found thus far. Our experiments show that the algorithm scales well as the number of patterns increases.
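
The keyword-and-prefix idea described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the names `GapPattern` and `find_matches` are hypothetical, keyword occurrences are located by a naive substring comparison (a practical implementation would more likely use a multi-keyword automaton), and recorded prefix end positions are never discarded. It assumes that keywords are non-empty and that gap lower bounds are non-negative.

```python
from typing import List, Optional, Tuple

# A gap is (min width, max width); a max width of None means "unbounded".
Gap = Tuple[int, Optional[int]]


class GapPattern:
    """Hypothetical container: keywords k0..k(m-1) separated by m-1 gaps."""

    def __init__(self, keywords: List[str], gaps: List[Gap]):
        assert keywords and all(keywords), "keywords must be non-empty"
        assert len(gaps) == len(keywords) - 1, "one gap between adjacent keywords"
        self.keywords = keywords
        self.gaps = gaps


def gap_ok(width: int, gap: Gap) -> bool:
    """True if a gap of the given width satisfies the (lo, hi) bounds."""
    lo, hi = gap
    return lo <= width and (hi is None or width <= hi)


def find_matches(patterns: List[GapPattern], text: str) -> List[Tuple[int, int]]:
    """Return (pattern index, exclusive end position) for each complete match end."""
    # ends[p][j]: end positions of matches of the prefix k0..kj of pattern p.
    ends = [[[] for _ in pat.keywords] for pat in patterns]
    matches = []
    for e in range(1, len(text) + 1):  # scan the text left to right
        for p, pat in enumerate(patterns):
            for j, kw in enumerate(pat.keywords):
                # Simplified pruning: if the prefix ending in keyword j-1 has
                # never matched, keyword j (and all later ones) cannot
                # contribute, so they are not looked for at all.
                if j > 0 and not ends[p][j - 1]:
                    break
                s = e - len(kw)  # start of a candidate keyword occurrence
                if s < 0 or text[s:e] != kw:
                    continue
                # Keyword j must follow some earlier prefix match at a legal distance.
                if j > 0 and not any(
                    gap_ok(s - prev, pat.gaps[j - 1]) for prev in ends[p][j - 1]
                ):
                    continue
                ends[p][j].append(e)
                if j == len(pat.keywords) - 1:
                    matches.append((p, e))
    return matches


example_patterns = [
    GapPattern(["ab", "cd"], [(1, 3)]),    # "ab", then 1 to 3 characters, then "cd"
    GapPattern(["a", "b"], [(0, None)]),   # "a", then any number of characters, then "b"
]
print(find_matches(example_patterns, "abxcd"))  # [(1, 2), (0, 5)]
```

In the example, the second pattern is reported first because its match ends at position 2, while the first pattern completes only at position 5, where "cd" follows "ab" after a gap of width 1.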