Factor Oracle: A New Structure for Pattern Matching
SOFSEM '99: Proceedings of the 26th Conference on Current Trends in Theory and Practice of Informatics
We introduce a new notion of weak factor recognition that underlies new data structures and on-line string matching algorithms. We define a new automaton, built on a string p = p_1 p_2 ... p_m, that acts as an oracle on the set of factors p_i ... p_j of p: if the automaton accepts a string, that string may be a factor of p; if it rejects the string, the string is certainly not a factor. We call this automaton the factor oracle. More precisely, it is acyclic, recognizes at least the factors of p, has m + 1 states, and has a number of transitions linear in m. We give a very simple sequential algorithm to construct it. Using this automaton, we design an on-line string matching algorithm that is efficient in practice and simple to implement; based on the experimental results, we conjecture that it is optimal. We also extend the factor oracle to predict whether a string could be a suffix of p (i.e., belong to the set p_i ... p_m). The resulting suffix oracle enables, in some cases, a subtle improvement to the preceding string matching algorithm.
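The abstract does not spell out the construction, so the following is only a minimal sketch of how a sequential (left-to-right) build of a factor oracle is commonly described: each new state i receives a transition from state i - 1, and a chain of "supply" links is followed to add extra transitions into i. The names `build_factor_oracle`, `accepts`, `trans`, and `supply` are illustrative assumptions, not identifiers from the paper.

```python
def build_factor_oracle(p):
    """Build a factor oracle for the string p, one character at a time.

    Returns (trans, supply), where trans[s] maps a character to the target
    state of the transition leaving state s, and supply[s] is the auxiliary
    link used during construction (-1 for the initial state).
    Hypothetical helper: names and layout are assumptions for illustration.
    """
    m = len(p)
    trans = [dict() for _ in range(m + 1)]  # m + 1 states: 0 .. m
    supply = [-1] * (m + 1)
    for i in range(1, m + 1):
        c = p[i - 1]
        trans[i - 1][c] = i  # "spine" transition spelling out p itself
        k = supply[i - 1]
        # Follow supply links, adding a shortcut into state i wherever
        # no transition on c exists yet.
        while k > -1 and c not in trans[k]:
            trans[k][c] = i
            k = supply[k]
        supply[i] = 0 if k == -1 else trans[k][c]
    return trans, supply


def accepts(trans, w):
    """Return True if w can be read from state 0 (every state is accepting)."""
    state = 0
    for c in w:
        if c not in trans[state]:
            return False  # rejected: w is certainly not a factor
        state = trans[state][c]
    return True  # accepted: w may be a factor


if __name__ == "__main__":
    trans, _ = build_factor_oracle("abbbaab")
    print(accepts(trans, "bbaa"))  # True: a genuine factor
    print(accepts(trans, "aba"))   # True, although "aba" is not a factor
    print(accepts(trans, "abab"))  # False: certainly not a factor
```

For p = abbbaab, the string aba illustrates the "weak" recognition described above: the oracle accepts it even though it does not occur in p, whereas a rejection such as abab is always conclusive.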