In this study we introduce NetBench, a benchmarking suite for network processors. NetBench contains a total of 9 applications that are representative of commercial applications for network processors. These applications cover all levels of packet processing: small, low-level code fragments as well as large application-level programs are included in the suite.

Using the SimpleScalar simulator, we study the NetBench programs in detail and characterize network processor workloads. We also compare key characteristics such as instructions per cycle, instruction distribution, branch prediction accuracy, and cache behavior with those of the programs from MediaBench. Although the MediaBench and NetBench suites target similar architectures, we show that their workloads have significantly different characteristics. Hence a separate benchmarking suite for network processors is a necessity. Finally, we present performance measurements from the Intel IXP1200 network processor to show how NetBench can be utilized.
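As a rough illustration of the characteristics named above (instructions per cycle, instruction distribution, branch prediction accuracy, and cache behavior), the sketch below derives them from raw event counts of a single simulated run. The `SimCounters` type, its field names, and the sample numbers are all hypothetical; they are not SimpleScalar's output format and not measurements from this study.

```python
from dataclasses import dataclass


@dataclass
class SimCounters:
    """Raw event counts from one simulated run (all field names are hypothetical)."""
    instructions: int
    cycles: int
    branches: int
    branch_mispredictions: int
    cache_accesses: int
    cache_misses: int
    instr_by_class: dict  # instruction class name -> dynamic instruction count


def characterize(c: SimCounters) -> dict:
    """Turn raw counts into the ratios usually compared across benchmarks."""
    return {
        # Instructions per cycle: committed instructions over elapsed cycles.
        "ipc": c.instructions / c.cycles,
        # Fraction of branches whose outcome was predicted correctly.
        "branch_accuracy": 1.0 - c.branch_mispredictions / c.branches,
        # Fraction of cache accesses that missed.
        "cache_miss_rate": c.cache_misses / c.cache_accesses,
        # Instruction distribution: share of each class in the dynamic stream.
        "instr_mix": {k: v / c.instructions for k, v in c.instr_by_class.items()},
    }


# Made-up counts for illustration only.
example = SimCounters(
    instructions=1_000_000,
    cycles=500_000,
    branches=200_000,
    branch_mispredictions=10_000,
    cache_accesses=400_000,
    cache_misses=8_000,
    instr_by_class={"load": 250_000, "store": 150_000, "branch": 200_000, "alu": 400_000},
)
metrics = characterize(example)
```

Computing the same ratios for each program in two suites puts them on a common footing, which is the kind of side-by-side comparison the abstract describes between NetBench and MediaBench.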