Bulldog: a compiler for VLSI architectures
Bulldog: a compiler for VLSI architectures
Iterative modulo scheduling: an algorithm for software pipelining loops
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Effective cluster assignment for modulo scheduling
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
PipeRench: a co/processor for streaming multimedia acceleration
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Modulo scheduling for a fully-distributed clustered VLIW architecture
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
CASES '01 Proceedings of the 2001 international conference on Compilers, architecture, and synthesis for embedded systems
Code Optimization Techniques for Embedded Processors: Methods, Algorithms, and Tools
Code Optimization Techniques for Embedded Processors: Methods, Algorithms, and Tools
Retargetable Code Generation for Digital Signal Processors
Retargetable Code Generation for Digital Signal Processors
Code Generation for Embedded Processors
Code Generation for Embedded Processors
Graph-partitioning based instruction scheduling for clustered processors
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
PICO-NPA: High-Level Synthesis of Nonprogrammable Hardware Accelerators
Journal of VLSI Signal Processing Systems
Compilation Approach for Coarse-Grained Reconfigurable Architectures
IEEE Design & Test
The MorphoSys Parallel Reconfigurable System
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Region-based hierarchical operation partitioning for multicluster processors
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Mapping applications to the RaPiD configurable architecture
FCCM '97 Proceedings of the 5th IEEE Symposium on FPGA-Based Custom Computing Machines
CARS: A New Code Generation Framework for Clustered ILP Processors
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
The Reconfigurable Streaming Vector Processor (RSVPTM)
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
FLASH: Foresighted Latency-Aware Scheduling Heuristic for Processors with Customized Datapaths
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
A loop accelerator for low power embedded VLIW processors
Proceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Increasing hardware efficiency with multifunction loop accelerators
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Modulo graph embedding: mapping applications onto coarse-grained reconfigurable architectures
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Copy or Discard execution model for speculative parallelization on multicores
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Playing the trade-off game: Architecture exploration using Coffeee
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Speculative parallelization of sequential loops on multicores
International Journal of Parallel Programming
An energy-efficient patchable accelerator for post-silicon engineering changes
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
A general constraint-centric scheduling framework for spatial architectures
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
SDC-based modulo scheduling for pipeline synthesis
Proceedings of the International Conference on Computer-Aided Design
Hi-index | 0.00 |
In the embedded domain, custom hardware in the form of ASICs is often used to implement critical parts of applications when performance and energy efficiency goals cannot be met with software implementations on a general purpose processor or DSP. The downsides of using ASICs include high non-recurring engineering costs, inability to accommodate changes in the application after production, and inability to reuse hardware for new applications. However, by allowing a degree of post-programmability, the hardware can retain high performance and energy efficiency while increasing flexibility and reusability. The difficulty with programmable custom hardware lies in mapping new applications onto an existing datapath that is both sparse and irregular. This paper proposes a constraint-driven modulo scheduler that maps software-pipelineable loops onto programmable loop accelerator hardware. The scheduler is able to target accelerators with widely varying levels of datapath functional capability and connectivity, and thus, varying degrees of programmability. The paper investigates the ability of the scheduler to map new loops onto existing hardware, which depends on both the degree of programmability of the hardware as well as the similarity of the new loop to the original loop for which the hardware was designed.