HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
This paper introduces Way Stealing, a simple architectural modification to a cache-based processor that increases data bandwidth to and from application-specific Instruction Set Extensions (ISEs). Way Stealing provides more bandwidth to the ISE logic than the register file alone and does not require expensive coherence protocols, because it adds no memory elements to the processor. When enhanced with Way Stealing, ISE identification flows detect more opportunities for acceleration than prior methods; consequently, Way Stealing achieves application speedups of up to 3.7x while reducing memory subsystem energy consumption by up to 67%, despite data-cache-related restrictions.