Resource Sharing in Custom Instruction Set Extensions

  • Authors:
  • Marcela Zuluaga; Nigel Topham

  • Affiliations:
  • Institute for Computing Systems Architecture, School of Informatics, University of Edinburgh (g.m.zuluaga@sms.ed.ac.uk); Institute for Computing Systems Architecture, School of Informatics, University of Edinburgh (npt@inf.ed.ac.uk)

  • Venue:
  • SASP '08 Proceedings of the 2008 Symposium on Application Specific Processors
  • Year:
  • 2008

Abstract

Customised processor performance generally increases as additional custom instructions are added. However, performance is not the only metric that modern systems must take into account; die area and energy efficiency are equally important. Resource sharing during synthesis of instruction set extensions (ISEs) can significantly reduce the die area and energy consumption of a customised processor, which may in turn increase the number of custom instructions that can be synthesised within a given area budget. Resource sharing involves combining the graph representations of two or more ISEs that contain a similar sub-graph. If performed naively, this coupling of multiple sub-graphs can considerably increase the latency of the extension instructions. Nevertheless, as we show in this paper, an appropriate level of resource sharing provides a significantly simpler design with only modest increases in average latency for extension instructions. Building on existing resource-sharing techniques, this study presents a new heuristic that controls the degree of resource sharing among a given set of custom instructions. Our main contributions are the introduction of a parametric method for exploring the trade-offs that can be achieved between instruction latency and implementation complexity, and the coupling of design-space exploration with fast area-delay models for the operators comprising each ISE. We present experimental evidence that our heuristic exposes a broad range of design points, allowing advantageous trade-offs between die area and latency to be found and exploited.
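The area-latency trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the heuristic from the paper; it is a minimal illustration under stated assumptions: each ISE is modelled as a simple chain of operators rather than a general graph, the `AREA`, `DELAY`, `MUX_AREA` and `MUX_DELAY` figures are hypothetical, and the single parameter `min_area_to_share` stands in for a "degree of sharing" control. An operator type is shared between ISEs only when its area meets that threshold, and every shared operator pays an assumed multiplexer cost in both area and delay.

```python
from collections import Counter

# Hypothetical per-operator area and delay figures (arbitrary units).
AREA = {"mul": 100, "add": 20, "xor": 5}
DELAY = {"mul": 4.0, "add": 1.5, "xor": 0.5}
MUX_AREA = 8      # assumed area of the multiplexing on one shared operator
MUX_DELAY = 0.4   # assumed delay added by that multiplexing


def evaluate(ises, min_area_to_share):
    """Return (total_area, average_latency) for a list of ISEs.

    Each ISE is a list of operator names executed as a chain, so its
    latency is the sum of its operator delays. An operator type is shared
    only if its area is at least `min_area_to_share` and more than one
    ISE actually uses it.
    """
    counts = [Counter(ise) for ise in ises]

    def is_shared(op):
        users = sum(1 for c in counts if c[op] > 0)
        return AREA[op] >= min_area_to_share and users > 1

    total_area = 0
    for op in AREA:
        per_ise = [c[op] for c in counts]
        if is_shared(op):
            # One pool of instances sized for the most demanding ISE.
            total_area += max(per_ise) * (AREA[op] + MUX_AREA)
        else:
            # Every ISE keeps its own private instances.
            total_area += sum(per_ise) * AREA[op]

    latencies = []
    for ise in ises:
        latency = 0.0
        for op in ise:
            latency += DELAY[op] + (MUX_DELAY if is_shared(op) else 0.0)
        latencies.append(latency)

    return total_area, sum(latencies) / len(latencies)


if __name__ == "__main__":
    ises = [["mul", "add", "xor"], ["mul", "add"]]
    # Sweep the threshold from "share nothing" to "share everything".
    for threshold in (1000, 50, 0):
        area, latency = evaluate(ises, threshold)
        print(f"threshold={threshold:5d} area={area:4d} avg_latency={latency:.2f}")
```

Sweeping the threshold yields a small set of design points: sharing nothing gives the largest area and the lowest latency, while sharing every common operator gives the smallest area at the cost of extra multiplexer delay. In this toy, sharing only the large multiplier captures most of the area saving, which is the kind of intermediate point a parametric exploration is meant to expose.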