High-performance regular expression scanning on the Cell/B.E. processor
Proceedings of the 23rd international conference on Supercomputing
SCAMPI: a scalable CAM-based algorithm for multiple pattern inspection
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Proceedings of the 24th ACM International Conference on Supercomputing
High performance dictionary-based string matching for deep packet inspection
INFOCOM'10 Proceedings of the 29th conference on Information communications
Experiences with string matching on the fermi architecture
ARCS'11 Proceedings of the 24th international conference on Architecture of computing systems
Global Futures: A Multithreaded Execution Model for Global Arrays-based Applications
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
A Bandwidth-Optimized Multi-core Architecture for Irregular Applications
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Exploiting points-to maps for de-/serialization code generation
Proceedings of the 28th Annual ACM Symposium on Applied Computing
An efficient multicharacter transition string-matching engine based on the aho-corasick algorithm
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
String searching is at the core of many security and network applications like search engines, intrusion detection systems, virus scanners and spam filters. The growing size of on-line content and the increasing wire speeds push the need for fast, and often real-time, string searching solutions. For these conditions, many software implementations (if not all) targeting conventional cache-based microprocessors do not perform well. They either exhibit overall low performance or exhibit highly variable performance depending on the types of inputs. For this reason, real-time state of the art solutions rely on the use of either custom hardware or Field-Programmable Gate Arrays (FPGAs) at the expense of overall system flexibility and programmability. This paper presents a software based implementation of the Aho-Corasick string searching algorithm on the Cray XMT multithreaded shared memory machine. Our solution relies on the particular features of the XMT architecture and on several algorithmic strategies: it is fast, scalable and its performance is virtually content-independent. On a 128-processor Cray XMT, it reaches a scanning speed of ≈ 28 Gbps with a performance variability below 10%. In the 10 Gbps performance range, variability is below 2.5%. By comparison, an Intel dual-socket, 8-core system running at 2.66 GHz achieves a peak performance which varies from 500 Mbps to 10 Gbps depending on the type of input and dictionary size.