Limits of instruction-level parallelism
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Locking effects in multiprocessor implementations of protocols
SIGCOMM '93 Conference proceedings on Communications architectures, protocols and applications
Proceedings of the seventeenth ACM symposium on Operating systems principles
Handling of packet dependencies: a critical issue for highly parallel network processors
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
NpBench: A Benchmark Suite for Control plane and Data plane Applications for Network Processors
ICCD '03 Proceedings of the 21st International Conference on Computer Design
Snort - Lightweight Intrusion Detection for Networks
LISA '99 Proceedings of the 13th USENIX conference on System administration
Automatically partitioning packet processing applications for pipelined architectures
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Prototyping Architectural Support for Program Rollback Using FPGAs
FCCM '05 Proceedings of the 13th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
Overcoming the memory wall in packet processing: hammers or ladders?
Proceedings of the 2005 ACM symposium on Architecture for networking and communications systems
Evaluating Network Processors using NetBench
ACM Transactions on Embedded Computing Systems (TECS)
CommBench-a telecommunications benchmark for network processors
ISPASS '00 Proceedings of the 2000 IEEE International Symposium on Performance Analysis of Systems and Software
Expressing and exploiting concurrency in networked applications with aspen
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Designing extensible IP router software
NSDI'05 Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation - Volume 2
NetFPGA--An Open Platform for Gigabit-Rate Network Switching and Routing
MSE '07 Proceedings of the 2007 IEEE International Conference on Microelectronic Systems Education
Internet clean-slate design: what and why?
ACM SIGCOMM Computer Communication Review
PIN: a binary instrumentation tool for computer architecture research and education
WCAE '04 Proceedings of the 2004 workshop on Computer architecture education: held in conjunction with the 31st International Symposium on Computer Architecture
Custom code generation for soft processors
ACM SIGARCH Computer Architecture News - Special issue on the 2006 reconfigurable and adaptive architecture workshop
Configurable Transactional Memory
FCCM '07 Proceedings of the 15th Annual IEEE Symposium on Field-Programmable Custom Computing Machines
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Experimental study of router buffer sizing
Proceedings of the 8th ACM SIGCOMM conference on Internet measurement
MultiLayer processing - an execution model for parallel stateful packet processing
Proceedings of the 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
Scaling Soft Processor Systems
FCCM '08 Proceedings of the 2008 16th International Symposium on Field-Programmable Custom Computing Machines
Practice of parallelizing network applications on multi-core architectures
Proceedings of the 23rd international conference on Supercomputing
Towards 100 gbit/s ethernet: multicore-based parallel communication protocol design
Proceedings of the 23rd international conference on Supercomputing
On the Performance of Contention Managers for Complex Transactional Memory Benchmarks
ISPDC '09 Proceedings of the 2009 Eighth International Symposium on Parallel and Distributed Computing
Is transactional programming actually easier?
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Language and compiler support for stream programs
Language and compiler support for stream programs
Application-specific signatures for transactional memory in soft processors
ARC'10 Proceedings of the 6th international conference on Reconfigurable Computing: architectures, Tools and Applications
NetTM: faster and easier synchronization for soft multicores via transactional memory
Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays
Application-specific signatures for transactional memory in soft processors
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
A regular expression matching engine with hybrid memories
Computer Standards & Interfaces
Hi-index | 0.00 |
Software packet processing is becoming more important to enable differentiated and rapidly-evolving network services. With increasing numbers of programmable processor and accelerator cores per network node, it is a challenge to support sharing and synchronization across them in a way that is scalable and easy-to-program. In this paper, we focus on parallel/threaded applications that have irregular control-flow and frequently-updated shared state that must be synchronized across threads. However, conventional lock-based synchronization is both difficult to use and also often results in frequent conservative serialization of critical sections. Alternatively, we propose that Transactional memory (TM) is a good match to software packet processing: it both (i) can allow the system to optimistically exploit parallelism between the processing of packets whenever it is safe to do so, and (ii) is easy-to-use for a programmer. With the NetFPGA [1] platform and four network packet processing applications that are threaded and share memory, we evaluate hardware support for TM (HTM) using the reconfigurable FPGA fabric. Relative to NetThreads [2], our two-processor four-way-multithreaded system with conventional lock-based synchronization, we find that adding HTM achieves 6%, 54% and 57% increases in packet throughput for three of four packet processing applications studied, due to reduced conservative serialization.