Space-efficient multiple string matching automata

  • Authors:
  • Meng Zhang; Tianyu Yang; Rui Wu

  • Affiliations:
  • College of Computer Science and Technology, Jilin University, Changchun, China; College of Computer Science and Technology, Jilin University, Changchun, China; College of Computer Science and Technology, Jilin University, Changchun, China

  • Venue:
  • International Journal of Wireless and Mobile Computing
  • Year:
  • 2012

Abstract

The Aho-Corasick (AC) automaton is a data structure for multiple string matching. We present two compression methods that enable the AC automaton to work on systems with limited resources, such as mobile devices. With the first method, the AC automaton for a pattern set P over an alphabet of size σ needs (σ + 1)I + (1 + log|P| + log M)M + o(M) bits, where M and I are the number of states and the number of non-leaf states of the AC automaton, respectively, and a state transition takes O(1) time. With the second method, the space is I + (1 + log|P| + log M + log σ)M + o(M log σ) bits, and a state transition takes O(log log σ) time. We then combine the two methods to achieve trade-offs between space and time complexity.
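
To make the quantities in the abstract concrete, the following is a minimal Python sketch of a plain, uncompressed AC automaton; it is not the compressed representation proposed in the paper. It counts M (states) and I (non-leaf states) for a small pattern set and evaluates only the leading terms of the two space bounds. The function names, the use of base-2 logarithms, the reading of |P| as the number of patterns, and the dropping of the o(·) terms are all assumptions made for illustration.

```python
from collections import deque
from math import log2


def build_ac(patterns):
    """Build a plain Aho-Corasick automaton (illustrative, uncompressed).

    Returns (goto, fail, out): goto[s] maps a character to a child state,
    fail[s] is the failure link, out[s] lists indices of patterns ending at s.
    """
    goto, fail, out = [{}], [0], [[]]
    for idx, pat in enumerate(patterns):
        s = 0
        for ch in pat:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(idx)
    # Breadth-first pass: failure links of shallower states are final
    # before deeper states are processed.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return goto, fail, out


def search(patterns, text):
    """Return (start_position, pattern) for every occurrence in text."""
    goto, fail, out = build_ac(patterns)
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for idx in out[s]:
            hits.append((i - len(patterns[idx]) + 1, patterns[idx]))
    return hits


def leading_space_bits(M, I, sigma, num_patterns):
    """Leading terms of the two bounds from the abstract, o(.) terms dropped.

    Assumes base-2 logarithms and |P| = number of patterns.
    """
    per_state = 1 + log2(num_patterns) + log2(M)
    first = (sigma + 1) * I + per_state * M
    second = I + (per_state + log2(sigma)) * M
    return first, second


patterns = ["he", "she", "his", "hers"]
goto, fail, out = build_ac(patterns)
M = len(goto)                      # number of states
I = sum(1 for g in goto if g)      # states with at least one child
hits = search(patterns, "ushers")
print(M, I)    # 10 7
print(hits)    # [(1, 'she'), (2, 'he'), (2, 'hers')]
first, second = leading_space_bits(M, I, sigma=26, num_patterns=len(patterns))
print(first > second)
```

For this toy input the second bound's leading terms come out smaller than the first's, which matches the abstract's framing: the first method spends more bits, on the order of σ per non-leaf state, to keep transitions at O(1) time, while the second saves space at the cost of O(log log σ) time per transition.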