q-Gram Matching Using Tree Models

  • Authors:
  • Prahlad Fogla;Wenke Lee

  • Affiliations:
  • -;IEEE Computer Society

  • Venue:
  • IEEE Transactions on Knowledge and Data Engineering
  • Year:
  • 2006

Abstract

q-gram matching is used for approximate substring matching problems in a wide range of application areas, including intrusion detection. In this paper, we present a tree-based model to perform fast, linear-time q-gram matching. All q-grams present in the text are stored in a tree structure similar to a trie. We use a tree redundancy pruning algorithm to reduce the size of the tree without losing any information. We also use suffix links for fast q-gram search during query matching. We compare our work with the Rabin-Karp-based hash-table technique, commonly used for multiple q-gram search. We present results of experiments on system call sequence data used for intrusion detection.
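
As a rough illustration of the basic idea of storing q-grams in a trie-like tree, the sketch below builds a plain depth-q trie from a training sequence and reports which windows of a query sequence are absent from it. It is not the authors' method: it omits the redundancy pruning and the suffix links described in the abstract, so each lookup costs O(q) and the scan is not linear time. The function names, the nested-dict representation, and the system call names in the usage note are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Sequence


def build_qgram_trie(text: Sequence[Hashable], q: int) -> Dict:
    """Insert every length-q window of `text` into a nested-dict trie."""
    root: Dict = {}
    for i in range(len(text) - q + 1):
        node = root
        for symbol in text[i:i + q]:
            # Create the child on first sight, then descend into it.
            node = node.setdefault(symbol, {})
    return root


def contains_qgram(root: Dict, gram: Sequence[Hashable]) -> bool:
    """Walk the trie along `gram`; fail on the first missing edge."""
    node = root
    for symbol in gram:
        if symbol not in node:
            return False
        node = node[symbol]
    return True


def mismatched_positions(root: Dict, query: Sequence[Hashable], q: int) -> List[int]:
    """Return start indices of length-q query windows not seen in training."""
    return [
        i
        for i in range(len(query) - q + 1)
        if not contains_qgram(root, query[i:i + q])
    ]
```

For example, with a hypothetical training trace `["open", "read", "mmap", "read", "close"]` and `q = 3`, the query `["open", "read", "mmap", "write", "close"]` has its first window in the trie, while the two windows containing `"write"` are reported as mismatches. Restarting every lookup at the root is the cost that suffix links are meant to avoid.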