Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
Limits of control flow on parallelism
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Instruction generation for hybrid reconfigurable systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Instruction-Level Parallelism for Reconfigurable Computing
FPL '98 Proceedings of the 8th International Workshop on Field-Programmable Logic and Applications, From FPGAs to Computing Paradigm
Automatic application-specific instruction-set extensions under microarchitectural constraints
Proceedings of the 40th annual Design Automation Conference
Automatic generation of application specific processors
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Scalable custom instructions identification for instruction-set extensible processors
Proceedings of the 2004 international conference on Compilers, architecture, and synthesis for embedded systems
ACME: adaptive compilation made efficient
LCTES '05 Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors
Proceedings of the 32nd annual international symposium on Computer Architecture
Automated Custom Instruction Generation for Domain-Specific Processor Acceleration
IEEE Transactions on Computers
MiBench: A free, commercially representative embedded benchmark suite
WWC '01 Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop
Proceedings of the conference on Design, automation and test in Europe: Proceedings
In search of near-optimal optimization phase orderings
Proceedings of the 2006 ACM SIGPLAN/SIGBED conference on Language, compilers, and tool support for embedded systems
Exact and approximate algorithms for the extension of embedded processor instruction sets
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Proceedings of the 2007 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems
Polynomial-time subgraph enumeration for automated instruction set extension
Proceedings of the conference on Design, automation and test in Europe
Recurrence-aware instruction set selection for extensible embedded processors
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Code transformation and instruction set extension
ACM Transactions on Embedded Computing Systems (TECS)
Modern development methods and tools for embedded reconfigurable systems: A survey
Integration, the VLSI Journal
A study of energy saving in customizable processors
SAMOS'07 Proceedings of the 7th international conference on Embedded computer systems: architectures, modeling, and simulation
Compiler-in-the-loop exploration during datapath synthesis for higher quality delay-area trade-offs
ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special section on adaptive power management for energy and temperature-aware computing systems
Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.00 |
Embedded application requirements, including high performance, low power consumption and fast time to market, are uncommon in the broader domain of general purpose applications. In order to satisfy these demands, chip manufacturers often provide developers with the possibility to define application-specific Instruction Set Extensions (ISEs). Many techniques have been proposed that automatically identify the most beneficial ISEs from source code, so that compilers can identify the 'best' instruction set for the underlying machine. However, can we simply retrofit these techniques into a traditional compiler, or does ISE identification demand different tuning of the heuristics utilized throughout the optimization pipeline? In this paper, we show why compilers should sometimes make different decisions when targeting customized processors, and we show how traditional ISE identification techniques can improve significantly if the code is properly transformed in order to expose more beneficial extensions. The proposed approach was validated using the SimpleScalar simulator for the ARM processor, augmented with the possibility to define additional instructions.Using benchmarks taken from the MiBench suite,we show that the proposed transformations improve state of the art ISE identi cation techniques by 55% on average and 4x maximum.