Automatic identification of application-specific functional units with architecturally visible storage

Authors:
Partha Biswas;Nikil Dutt;Paolo Ienne;Laura Pozzi
Affiliations:
University of California, Irvine, CA;University of California, Irvine, CA;Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland;University of Lugano, Lugano, Switzerland
Venue:
Proceedings of the Conference on Design, Automation and Test in Europe
Year:
2006

Abstract

Instruction Set Extensions (ISEs) can effectively accelerate embedded processors. The critical and difficult task of ISE selection is often performed manually by designers. A few automatic methods for ISE generation have shown good capabilities, but they remain limited in their handling of memory accesses and so fail to address the memory wall problem directly. We present the first ISE identification technique that can automatically and comprehensively identify state-holding Application-specific Functional Units (AFUs), and can thus eliminate a large portion of memory traffic from cache and main memory. Cycle-accurate results obtained with the SimpleScalar simulator show that the identified AFUs with architecturally visible storage gain significantly more than previous techniques and achieve an average speedup of 2.8x over pure software execution. Moreover, the number of required memory-access instructions is reduced by two thirds on average, suggesting corresponding benefits in energy consumption.