On the bit-parallel simulation of the nondeterministic Aho-Corasick and suffix automata for a set of patterns

  • Authors:
  • Domenico Cantone; Simone Faro; Emanuele Giaquinta

  • Affiliations:
  • Università di Catania, Dipartimento di Matematica e Informatica, Viale Andrea Doria 6, I-95125 Catania, Italy; Università di Catania, Dipartimento di Matematica e Informatica, Viale Andrea Doria 6, I-95125 Catania, Italy; Università di Catania, Dipartimento di Matematica e Informatica, Viale Andrea Doria 6, I-95125 Catania, Italy

  • Venue:
  • Journal of Discrete Algorithms
  • Year:
  • 2012

Abstract

In this paper we present a method to simulate, using the bit-parallelism technique, the nondeterministic Aho-Corasick automaton and the nondeterministic suffix automaton induced, respectively, by the trie and by the Directed Acyclic Word Graph for a set of patterns. When the prefix redundancy is non-negligible, this method yields, compared with the original bit-parallel encoding with no prefix factorization, a representation that requires smaller bit-vectors and, correspondingly, fewer words. In particular, if we restrict to single-word bit-vectors, more patterns can be packed into a word. We also present two simple algorithms, based on this technique, for searching a set $P$ of patterns in a text $T$ of length $n$ over an alphabet $\Sigma$ of size $\sigma$. Our algorithms, named Log-And and Backward-Log-And, require $O((m+\sigma)\lceil m/w \rceil)$ space and work in $O(n \lceil m/w \rceil)$ and $O(n \lceil m/w \rceil l_{min})$ worst-case searching time, respectively, where $w$ is the number of bits in a computer word, $m$ is the number of states of the automaton, and $l_{min}$ is the length of the shortest pattern in $P$.
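
The space saving claimed above can be illustrated with a small sketch. It is not the Log-And algorithm itself. It only compares the number of bits needed when every pattern character gets its own bit (no prefix factorization) with the number of trie nodes, where shared prefixes are stored once. The function names, the sample patterns, the word size of 16 bits, and the choice to leave the root out of the count are all assumptions made for illustration; the paper's exact definition of $m$ may differ by a constant.

```python
from math import ceil


def trie_state_count(patterns):
    """Count the nodes of the trie built from `patterns`, excluding the root.

    Hypothetical helper: shared prefixes contribute their nodes only once.
    """
    root = {}
    count = 0
    for pattern in patterns:
        node = root
        for ch in pattern:
            if ch not in node:
                node[ch] = {}   # a new trie node, i.e. one more automaton state
                count += 1
            node = node[ch]
    return count


def words_needed(bits, w):
    """Number of w-bit words needed to hold `bits` bits: ceil(bits / w)."""
    return ceil(bits / w)


# Illustrative pattern set with heavy prefix sharing (an assumption, not from the paper).
patterns = ["search", "searching", "seashell", "season"]

unfactored_bits = sum(len(p) for p in patterns)  # one bit per pattern character
factored_bits = trie_state_count(patterns)       # one bit per trie node

w = 16  # assumed word size, chosen small so the difference is visible
print(unfactored_bits, words_needed(unfactored_bits, w))
print(factored_bits, words_needed(factored_bits, w))
```

With these sample patterns the unfactored encoding needs 29 bits (two 16-bit words), while the trie has 16 non-root nodes and fits in a single word, which matches the observation that more patterns can be packed into a word when prefixes are shared.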