A Scalable Parallel SoC Architecture for Network Processors
ISVLSI '05 Proceedings of the IEEE Computer Society Annual Symposium on VLSI: New Frontiers in VLSI Design
Resource efficiency of the GigaNetIC chip multiprocessor architecture
Journal of Systems Architecture: the EUROMICRO Journal
Power aware reconfigurable multiprocessor for elliptic curve cryptography
Proceedings of the conference on Design, automation and test in Europe
Journal of Systems Architecture: the EUROMICRO Journal
Runtime resource allocation in multi-core packet processing systems
HPSR'09 Proceedings of the 15th international conference on High Performance Switching and Routing
A multiprocessor cache for massively parallel soc architectures
ARCS'07 Proceedings of the 20th international conference on Architecture of computing systems
Runtime Reconfiguration of Multiprocessors Based on Compile-Time Analysis
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Simplifying data path processing in next-generation routers
Proceedings of the 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
GigaNetIC – a scalable embedded on-chip multiprocessor architecture for network applications
ARCS'06 Proceedings of the 19th international conference on Architecture of Computing Systems
Instruction set architectural guidelines for embedded packet-processing engines
Journal of Systems Architecture: the EUROMICRO Journal
Hi-index | 0.01 |
This paper addresses the design automation of instruction set extensions for application-specific processors with emphasis on network processing. Within this domain, increasing performance demands and the ongoing development of network protocols both call for flexible and performance-optimized processors. Our approach represents a holistic methodology for the extension and optimization of a processorýs instruction set. The starting point is a concise yet powerful processor abstraction, which is well suited to automatically generate the important parts of a compiler backend and cycle-accurate simulator so that domain-characteristic benchmarks can be analyzed for frequently occurring instruction pairs. These instruction pairs are promising candidates for the extension of the instruction set by means of super-instructions. Provided that a new super-instruction meets a given performance threshold, a fine-grained performance re-evaluation of the adapted processor design can be conducted instantly. With respect to the chosen domain-characteristic benchmark, the tool-chain pinpoints important characteristics such as execution performance, energy consumption, or chip area of the extended design. Using this holistic design methodology, we are able to judge a refinement of the processor rapidly.