Demands for flexible processing have moved general-purpose processing into the data path of networks. With the development of system-on-a-chip technology, it is possible to put a number of processors with memory and I/O components on a single ASIC. We present a performance model of such a system and show how the number of processors, the cache sizes, and the tradeoffs between the use of on-chip SRAM and DRAM can be optimized in terms of computation per unit chip area for a given workload. Based on a telecommunications benchmark, we present the results of such an optimization and identify and discuss design tradeoffs for systems-on-a-chip.
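
The abstract does not give the model's equations, so the following is only a toy sketch of what optimizing computation per unit chip area can look like. Every constant, the power-law miss-rate curve, and the function names (`miss_rate`, `chip_area`, `throughput`, `best_config`) are illustrative assumptions, not the paper's model or data. The sketch also omits the on-chip SRAM versus DRAM choice and models only processor count and per-processor cache size against a shared memory interface.

```python
from itertools import product

# All constants are illustrative assumptions, not values from the paper.
CLOCK_HZ = 400e6                # processor clock frequency
CORE_AREA_MM2 = 2.0             # area of one processor core
CACHE_AREA_MM2_PER_KB = 0.15    # cache area per kilobyte
MEM_IF_AREA_MM2 = 10.0          # fixed area of the shared memory interface
MEM_BANDWIDTH_BYTES_S = 1.6e9   # bandwidth of the shared memory interface
LINE_BYTES = 32                 # bytes transferred per cache miss
MEM_REFS_PER_INSTR = 0.3        # memory references per instruction
MISS_PENALTY_CYCLES = 40        # stall cycles per cache miss
BASE_MISS_RATE = 0.05           # assumed miss rate of a 1 KB cache
MISS_EXPONENT = 0.5             # assumed power-law decay of the miss rate


def miss_rate(cache_kb):
    """Assumed power-law miss rate; cache_kb must be positive."""
    return min(1.0, BASE_MISS_RATE * cache_kb ** -MISS_EXPONENT)


def chip_area(n_procs, cache_kb):
    """Total area: shared memory interface plus n cores with private caches."""
    per_proc = CORE_AREA_MM2 + CACHE_AREA_MM2_PER_KB * cache_kb
    return MEM_IF_AREA_MM2 + n_procs * per_proc


def throughput(n_procs, cache_kb):
    """Instructions per second, capped by the shared memory bandwidth."""
    misses_per_instr = MEM_REFS_PER_INSTR * miss_rate(cache_kb)
    cpi = 1.0 + misses_per_instr * MISS_PENALTY_CYCLES
    compute_bound = n_procs * CLOCK_HZ / cpi
    bandwidth_bound = MEM_BANDWIDTH_BYTES_S / (misses_per_instr * LINE_BYTES)
    return min(compute_bound, bandwidth_bound)


def best_config(processor_counts, cache_sizes_kb):
    """Exhaustively pick the (n_procs, cache_kb) pair with the most work per mm^2."""
    return max(
        product(processor_counts, cache_sizes_kb),
        key=lambda cfg: throughput(*cfg) / chip_area(*cfg),
    )


if __name__ == "__main__":
    n, c = best_config(range(1, 65), [1, 2, 4, 8, 16, 32, 64])
    print(f"best: {n} processors, {c} KB cache each, "
          f"{throughput(n, c) / chip_area(n, c):.3e} instr/s per mm^2")
```

In this toy model, larger caches lower the miss rate but take area away from additional cores, and adding cores stops helping once the shared memory interface saturates. That tension is what makes the throughput-per-area optimum non-trivial.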