GPU-based NFA implementation for memory efficient high speed regular expression matching

Authors:
Yuan Zu;Ming Yang;Zhonghu Xu;Lin Wang;Xin Tian;Kunyang Peng;Qunfeng Dong
Affiliations:
University of Science and Technology of China, Hefei, Anhui, China;University of Science and Technology of China, Hefei, Anhui, China;University of Science and Technology of China, Hefei, Anhui, China;University of Science and Technology of China, Hefei, Anhui, China;University of Science and Technology of China, Hefei, Anhui, China;University of Science and Technology of China, Hefei, Anhui, China;University of Science and Technology of China, Hefei, Anhui, China
Venue:
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
Year:
2012

References (24)

Algorithms to accelerate multiple regular expressions matching for deep packet inspection
Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications

Fast and memory-efficient regular expression matching for deep packet inspection
Proceedings of the 2006 ACM/IEEE symposium on Architecture for networking and communications systems

An improved algorithm to accelerate regular expression evaluation
Proceedings of the 3rd ACM/IEEE Symposium on Architecture for networking and communications systems

Curing regular expressions matching algorithms from insomnia, amnesia, and acalculia
Proceedings of the 3rd ACM/IEEE Symposium on Architecture for networking and communications systems

A hybrid finite automaton for practical deep packet inspection
CoNEXT '07 Proceedings of the 2007 ACM CoNEXT conference

XFA: Faster Signature Matching with Extended Automata
SP '08 Proceedings of the 2008 IEEE Symposium on Security and Privacy

Deflating the big bang: fast and scalable deep packet inspection with extended finite automata
Proceedings of the ACM SIGCOMM 2008 conference on Data communication

Gnort: High Performance Network Intrusion Detection Using Graphics Processors
RAID '08 Proceedings of the 11th international symposium on Recent Advances in Intrusion Detection

Efficient regular expression evaluation: theory to practice
Proceedings of the 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems

OpenMP to GPGPU: a compiler framework for automatic translation and optimization
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming

Extending finite automata to efficiently match Perl-compatible regular expressions
CoNEXT '08 Proceedings of the 2008 ACM CoNEXT Conference

Regular Expression Matching on Graphics Hardware for Intrusion Detection
RAID '09 Proceedings of the 12th International Symposium on Recent Advances in Intrusion Detection

An adaptive performance modeling tool for GPU architectures
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Model-driven autotuning of sparse matrix-vector multiply on GPUs
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Fast tridiagonal solvers on the GPU
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

CUDAlign: using GPU to accelerate the comparison of megabase genomic sequences
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

iNFAnt: NFA pattern matching on GPGPU devices
ACM SIGCOMM Computer Communication Review

Fast regular expression matching using small TCAMs for network intrusion detection and prevention systems
USENIX Security'10 Proceedings of the 19th USENIX conference on Security

GRace: a low-overhead mechanism for detecting data races in GPU programs
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming

Auto-tuning of fast fourier transform on graphics processors
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming

Accelerating CUDA graph algorithms at maximum warp
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming

Achieving a single compute device image in OpenCL for multiple GPUs
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming

TCAM-based DFA deflation: a novel approach to fast and scalable regular expression matching
Proceedings of the Nineteenth International Workshop on Quality of Service

Chain-Based DFA Deflation for Fast and Scalable Regular Expression Matching Using TCAM
Proceedings of the 2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems

Cited by (7)

TCAM-based NFA implementation
Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE joint international conference on Measurement and Modeling of Computer Systems

Multi-gigabit traffic identification on GPU
Proceedings of the first edition workshop on High performance and programmable networking

Wire speed name lookup: a GPU-based approach
NSDI'13 Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation

GPU acceleration of regular expression matching for large datasets: exploring the implementation space
Proceedings of the ACM International Conference on Computing Frontiers

Challenging the "embarrassingly sequential": parallelizing finite state machine-based computations through principled speculation
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems

GPU-accelerated name lookup with component encoding
Computer Networks: The International Journal of Computer and Telecommunications Networking

Reviewing traffic classification
Data Traffic Monitoring and Analysis

Abstract

Regular expression pattern matching is the foundation and core engine of many network functions, such as network intrusion detection, worm detection, traffic analysis, and web applications. DFA-based solutions suffer from an exponentially exploding state space, which cannot be remedied without sacrificing matching speed. Given this scalability problem of DFA-based methods, there has been increasing interest in NFA-based methods for memory-efficient regular expression matching. Achieving high matching speed with an NFA requires potentially massive parallel processing, which makes it an ideal programming task for a Graphics Processing Unit (GPU). Based on an in-depth understanding of NFA properties as well as GPU architecture, we propose effective methods for fitting NFAs into the GPU architecture through proper data structure and parallel programming design, so that the GPU's parallel processing power can be better utilized to achieve high-speed regular expression matching. Experimental results demonstrate that, compared with the existing GPU-based NFA implementation method [9], our proposed methods boost matching speed by 29 to 46 times, consistently yielding matching speeds above 10 Gbps on an NVIDIA GTX-460 GPU. Meanwhile, our design needs only a small amount of memory, which grows exponentially more slowly than DFA size. These results make our design an effective solution for memory-efficient high-speed regular expression matching and clearly demonstrate the power and potential of the GPU as a platform for this task.
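
The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' GPU design; it is a plain CPU simulation, and every name in it (`build_nfa`, `step`, `match_positions`, `dfa_state_count`) is hypothetical. It assumes the textbook pattern `(a|b)*a(a|b){n}` over the alphabet {a, b}, whose NFA needs only n + 2 states while the equivalent DFA needs 2^(n+1). The inner loop of `step` visits every active NFA state for each input symbol, which is the per-symbol work a parallel implementation could spread across threads.

```python
from collections import deque


def build_nfa(n):
    """Build an NFA for (a|b)*a(a|b){n} (illustrative pattern, not from the paper).

    States are 0..n+1. State 0 is the start state, state n+1 is accepting.
    trans[symbol][state] is a bitmask of successor states.
    """
    num_states = n + 2
    trans = {"a": [0] * num_states, "b": [0] * num_states}
    for sym in "ab":
        trans[sym][0] |= 1 << 0            # start state loops on every symbol
        for s in range(1, n + 1):
            trans[sym][s] |= 1 << (s + 1)  # count n further symbols
    trans["a"][0] |= 1 << 1                # guess that this 'a' starts a match
    start_mask = 1 << 0
    accept_mask = 1 << (n + 1)
    return trans, start_mask, accept_mask


def step(trans, active, sym):
    """Advance all active states on one input symbol.

    Each active state is handled independently of the others, so this loop
    is the part that could be executed in parallel.
    """
    row = trans[sym]
    nxt = 0
    state = 0
    while active:
        if active & 1:
            nxt |= row[state]
        active >>= 1
        state += 1
    return nxt


def match_positions(n, text):
    """Return the indices in text at which the accepting state is active."""
    trans, active, accept = build_nfa(n)
    hits = []
    for i, sym in enumerate(text):
        active = step(trans, active, sym)
        if active & accept:
            hits.append(i)
    return hits


def dfa_state_count(n):
    """Count DFA states reachable by subset construction from the same NFA."""
    trans, start, _ = build_nfa(n)
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for sym in trans:
            nxt = step(trans, cur, sym)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)


if __name__ == "__main__":
    print(match_positions(2, "abbab"))
    for n in (2, 5, 10):
        print(n, "NFA states:", n + 2, "DFA states:", dfa_state_count(n))
```

For n = 10 the NFA has 12 states while subset construction reaches 2048 DFA states, which is the kind of gap that motivates NFA-based matching when memory is the constraint. The price is that several NFA states may be active at once, so each input symbol costs more work than a single DFA table lookup.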