This paper describes an integer-linear-programming (ILP) based system, custom hardware instruction processor synthesis (CHIPS), that identifies custom instructions for critical code segments, given the available data bandwidth and transfer latencies between custom logic and a baseline processor with architecturally visible state registers. Our approach lets designers optionally constrain the number of input and output operands for custom instructions. We describe a design flow that identifies promising area, performance, and code-size trade-offs, and we study the effect of input/output constraints, register-file ports, and compiler transformations such as if-conversion. Our experiments show that, in most cases, the highest-performing solutions are found when the input/output constraints are removed; however, input/output constraints help our algorithms identify frequently used code segments, reducing the overall area overhead. Results for 11 benchmarks covering cryptography and multimedia show speed-ups between 1.7x and 6.6x, code-size reductions between 6% and 72%, and area costs equivalent to between 12 and 256 adders for maximum speed-up. Our ILP-based approach scales well: benchmarks whose basic blocks contain more than 1000 instructions are solved optimally, usually within a few seconds.
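To make the identification problem concrete, the following sketch enumerates candidate custom instructions for a toy basic block. It is a minimal illustration under stated assumptions: the data-flow graph, the software/hardware latencies, and the brute-force search are all invented for this example (the paper formulates the same problem as an ILP and models transfer latencies, which this sketch ignores). A candidate must be a convex subgraph that respects the input/output operand constraints, and the objective is the cycle saving over software execution.

```python
from itertools import combinations

# Illustrative basic-block data-flow graph: node -> predecessors.
# Node names and latencies are assumptions for this sketch, not from the paper.
DFG = {
    "a": [], "b": [], "c": [], "d": [],   # live-in values
    "t1": ["a", "b"],                     # t1 = a + b
    "t2": ["c", "d"],                     # t2 = c + d
    "t3": ["t1", "t2"],                   # t3 = t1 * t2
    "t4": ["t1", "t3"],                   # t4 = t1 ^ t3
}
SW_LAT = {"t1": 1, "t2": 1, "t3": 3, "t4": 1}  # software cycles per operation
HW_LAT = {"t1": 1, "t2": 1, "t3": 1, "t4": 1}  # cycles as chained custom logic

OPS = [n for n in DFG if DFG[n]]                          # operation nodes only
SUCCS = {n: [m for m in DFG if n in DFG[m]] for n in DFG}  # successor map

def io_counts(sub):
    """External inputs and externally visible outputs of a candidate."""
    ins = {p for n in sub for p in DFG[n] if p not in sub}
    outs = {n for n in sub
            if not SUCCS[n] or any(m not in sub for m in SUCCS[n])}
    return ins, outs

def is_convex(sub):
    """Convexity: no value may leave the subgraph and feed back into it."""
    def escapes_and_returns(n, left):
        for m in SUCCS[n]:
            if m in sub and left:
                return True
            if escapes_and_returns(m, left or m not in sub):
                return True
        return False
    return not any(escapes_and_returns(n, False) for n in sub)

def hw_cycles(sub):
    """Critical-path length of the subgraph when executed as custom logic."""
    memo = {}
    def depth(n):
        if n not in memo:
            memo[n] = HW_LAT[n] + max(
                (depth(p) for p in DFG[n] if p in sub), default=0)
        return memo[n]
    return max(depth(n) for n in sub)

def best_instruction(max_in, max_out):
    """Exhaustively pick the feasible subgraph with the largest cycle saving."""
    best, best_gain = None, 0
    for k in range(2, len(OPS) + 1):
        for cand in combinations(OPS, k):
            sub = set(cand)
            ins, outs = io_counts(sub)
            if len(ins) > max_in or len(outs) > max_out:
                continue
            if not is_convex(sub):
                continue
            gain = sum(SW_LAT[n] for n in sub) - hw_cycles(sub)
            if gain > best_gain:
                best, best_gain = sorted(sub), gain
    return best, best_gain

print(best_instruction(4, 1))  # -> (['t1', 't2', 't3', 't4'], 3)
print(best_instruction(3, 1))  # -> (['t2', 't3'], 2)
```

The two calls mirror the abstract's observation: with enough input bandwidth the whole block becomes one custom instruction with the largest speed-up, while a tighter input budget forces a smaller, cheaper pattern.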