A new taxonomy of sublinear (multiple) keyword pattern matching algorithms is presented. Based on an earlier taxonomy by the second and third authors, this new taxonomy includes not only suffix-based algorithms, but also factor- and factor-oracle-based algorithms. In particular, we show how suffix-based (Commentz-Walter-like), factor-based and factor-oracle-based sublinear keyword pattern matching algorithms can be seen as instantiations of a general sublinear algorithm skeleton. During processing, such algorithms shift or jump through the text in a forward (left-to-right) direction, and read backward (right-to-left) starting from positions in the text; that is, they read suffixes of certain prefixes of the text. They use finite automata to decide efficiently whether a string belongs to a certain language. In addition, we show that the shift functions defined for the suffix-based algorithms can be reused for the factor- and factor-oracle-based algorithms. The taxonomy is built by deriving the algorithms from a common starting point, adding algorithm and problem details step by step to arrive at efficient or well-known algorithms. Such a presentation provides correctness arguments for the algorithms as well as clarity on how the algorithms are related to one another. In addition, it is helpful in the construction of a toolkit of the algorithms.
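The skeleton described above (shift forward through the text, read backward from the current position, use an automaton to recognise keyword suffixes) can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes a reverse trie as the automaton and a simple Set-Horspool-style shift table rather than the Commentz-Walter shift functions the taxonomy covers, and the names `build_reverse_trie`, `build_shift_table` and `find_all` are hypothetical.

```python
def build_reverse_trie(keywords):
    """Trie over the reversed keywords; 'out' lists keywords ending at a node."""
    root = {"next": {}, "out": []}
    for kw in keywords:
        node = root
        for ch in reversed(kw):
            node = node["next"].setdefault(ch, {"next": {}, "out": []})
        node["out"].append(kw)
    return root


def build_shift_table(keywords):
    """Assumed Set-Horspool-style shift: for each character, the smallest
    distance from a non-final occurrence to the end of any keyword,
    capped at the minimum keyword length."""
    lmin = min(len(kw) for kw in keywords)
    shift = {}
    for kw in keywords:
        for i, ch in enumerate(kw[:-1]):
            dist = len(kw) - 1 - i
            shift[ch] = min(shift.get(ch, lmin), dist)
    return lmin, shift


def find_all(text, keywords):
    """Return sorted (start_index, keyword) pairs for all occurrences."""
    keywords = [kw for kw in set(keywords) if kw]
    if not keywords:
        return []
    trie = build_reverse_trie(keywords)
    lmin, shift = build_shift_table(keywords)
    matches = []
    end = lmin  # exclusive end of the current text prefix
    while end <= len(text):
        # Read backward: follow suffixes of text[:end] through the reverse trie.
        node, pos = trie, end - 1
        while pos >= 0 and text[pos] in node["next"]:
            node = node["next"][text[pos]]
            for kw in node["out"]:
                matches.append((pos, kw))
            pos -= 1
        # Shift forward by a safe distance chosen from the last character read first.
        end += shift.get(text[end - 1], lmin)
    return sorted(matches)


if __name__ == "__main__":
    print(find_all("ushers", ["he", "she", "his", "hers"]))
```

In this sketch the shift is safe because any keyword ending `k` positions further right either contains the current last character `k` positions before its end, or is no longer than `k`; the table takes the minimum over both cases. Swapping in a different shift function, or a factor automaton or factor oracle in place of the reverse trie, changes only the two helper functions, which is the kind of reuse the abstract refers to.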