We present PacketShader, a high-performance software router framework for general packet processing with Graphics Processing Unit (GPU) acceleration. PacketShader exploits the massively parallel processing power of the GPU to address the CPU bottleneck in current software routers. Combined with our high-performance packet I/O engine, PacketShader outperforms existing software routers by more than a factor of four, forwarding 64B IPv4 packets at 39 Gbps on a single commodity PC. We have implemented IPv4 and IPv6 forwarding, OpenFlow switching, and IPsec tunneling to demonstrate the flexibility and performance advantage of PacketShader. The evaluation results show that the GPU delivers significantly higher throughput than the CPU-only implementation, confirming its effectiveness for computation- and memory-intensive operations in packet processing.
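To put the headline figure in perspective, the short sketch below converts "64B packets at 39 Gbps" into a packet rate and a per-packet time budget. It is illustrative arithmetic only, not part of PacketShader: the function names are hypothetical, and the abstract does not say whether the 39 Gbps figure counts per-packet Ethernet framing overhead, so `overhead_bytes` is an assumed parameter (20 bytes covers the standard preamble plus inter-frame gap).

```python
def packet_rate_pps(throughput_gbps: float, packet_bytes: int,
                    overhead_bytes: int = 0) -> float:
    """Packets per second implied by a throughput figure.

    overhead_bytes is an assumption: the abstract does not state whether
    framing overhead is included in the quoted throughput.
    """
    bits_per_packet = (packet_bytes + overhead_bytes) * 8
    return throughput_gbps * 1e9 / bits_per_packet


def per_packet_budget_ns(rate_pps: float) -> float:
    """Average time available per packet, in nanoseconds."""
    return 1e9 / rate_pps


if __name__ == "__main__":
    # 0 = no framing overhead counted; 20 = preamble (8 B) + inter-frame gap (12 B)
    for overhead in (0, 20):
        rate = packet_rate_pps(39.0, 64, overhead)
        budget = per_packet_budget_ns(rate)
        print(f"overhead={overhead:2d} B: {rate / 1e6:.1f} Mpps, "
              f"{budget:.1f} ns per packet")
```

Under either assumption the budget is well below 20 ns per packet on average, which is consistent with the abstract's emphasis on parallel processing and a fast packet I/O path.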