The overhead of memory accesses limits the performance of packet-processing applications. To overcome this bottleneck, today's network processors can use a wide range of mechanisms, such as multi-level memory hierarchies, wide-word accesses, special-purpose result caches, asynchronous memory, and hardware multi-threading. However, supporting all of these mechanisms complicates programmability and hardware design, and wastes system resources. In this paper, we address the following fundamental question: what minimal set of hardware mechanisms must a network processor support to achieve the twin goals of simplified programmability and high packet throughput? We show that no single mechanism suffices; the minimal set must include data caches and multi-threading. The two are complementary: data caches exploit locality to reduce the number of context switches and the off-chip memory bandwidth requirement, whereas multi-threading exploits parallelism to hide long cache-miss latencies.
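The complementarity claim can be illustrated with a small back-of-the-envelope model. This is only a sketch under stated assumptions, not the model used in the paper: the function name `utilization`, its parameters, and the numbers in the usage lines are all hypothetical, context switches are assumed to be free, and off-chip bandwidth is assumed to be unlimited.

```python
def utilization(threads, hit_rate, compute_cycles, miss_latency):
    """Fraction of cycles a core spends on useful work in a toy model.

    Assumptions (illustrative, not taken from the paper):
      - every memory access is preceded by `compute_cycles` of work;
      - a cache hit adds no stall, a miss stalls the issuing thread
        for `miss_latency` cycles;
      - while one thread is stalled, another ready thread runs, and
        switching between threads costs nothing.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")

    miss_rate = 1.0 - hit_rate
    if miss_rate == 0.0:
        return 1.0  # no misses, so the core never stalls

    # Average amount of work one thread completes between two misses.
    run = compute_cycles / miss_rate

    # One thread keeps the core busy run / (run + miss_latency) of the
    # time; additional threads fill the stalls until the core saturates.
    return min(1.0, threads * run / (run + miss_latency))


if __name__ == "__main__":
    # Hypothetical workload: 10 cycles of work per access, 100-cycle miss.
    for threads, hit_rate in [(1, 0.0), (1, 0.9), (8, 0.0), (8, 0.9)]:
        u = utilization(threads, hit_rate, compute_cycles=10, miss_latency=100)
        print(f"threads={threads} hit_rate={hit_rate:.1f} utilization={u:.2f}")
```

With these made-up numbers, neither a cache with one thread nor eight threads without a cache keeps the core fully busy, while the combination does. The model deliberately ignores two effects the abstract highlights, namely the cost of context switches and the off-chip bandwidth that every miss consumes, so it understates the benefit of the cache: in this sketch, enough threads alone would eventually saturate the core, which a bandwidth-limited system would not allow.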