Design-space exploration of resource-sharing solutions for custom instruction set extensions
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Selecting profitable custom instructions for reconfigurable processors
Journal of Systems Architecture: the EUROMICRO Journal
Proceedings of the 16th ACM/IEEE international symposium on Low power electronics and design
Proceedings of the 2010 Asia and South Pacific Design Automation Conference
Efficient datapath merging for the overhead reduction of run-time reconfigurable systems
The Journal of Supercomputing
On the asymptotic costs of multiplexer-based reconfigurability
Proceedings of the 49th Annual Design Automation Conference
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Hardware reuse in modern application-specific processors and accelerators
Microprocessors & Microsystems
Hi-index | 0.00 |
Customised processor performance generally increases as additional custom instructions are added. However, performance is not the only metric that modern systems must take into account; die area and energy efficiency are equally important. Resource sharing during synthesis of instruction set extensions (ISEs) can reduce significantly the die area and energy consumption of a customised processor. This may increase the number of custom instructions that can be synthesized with a given area budget. Resource sharing involves combining the graph representations of two or more ISEs which contain a similar sub-graph. This coupling of multiple sub-graphs, if performed naively, can increase the latency of the extension instructions considerably. And yet, as we show in this paper, an appropriate level of resource sharing provides a significantly simpler design with only modest increases in average latency for extension instructions. Based on existing resource-sharing techniques, this study presents a new heuristic that controls the degree of resource sharing between a given set of custom instructions. Our main contributions are the introduction of a parametric method for exploring the trade-offs that can be achieved between instruction latency and implementation complexity, and the coupling of design-space exploration with fast area-delay models for the operators comprising each ISE. We present experimental evidence that our heuristic exposes a broad range of design points, allowing advantageous trade-offs between die area and latency to be found and exploited.