Effective compiler support for predicated execution using the hyperblock
MICRO 25 Proceedings of the 25th annual international symposium on Microarchitecture
Trace cache: a low latency approach to high bandwidth instruction fetching
Proceedings of the 29th annual ACM/IEEE international symposium on Microarchitecture
A study of branch prediction strategies
25 years of the international symposia on Computer architecture (selected papers)
Reconfigurable computing: what, why, and implications for design automation
Proceedings of the 36th annual ACM/IEEE Design Automation Conference
Dataflow Partitioning and Scheduling Algorithms for WASMII, a Virtual Hardware
FPL '00 Proceedings of the The Roadmap to Reconfigurable Computing, 10th International Workshop on Field-Programmable Logic and Applications
Temporal Partitioning and Scheduling for Reconfigurable Computing
FCCM '98 Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines
Multithreaded Architectural Support for Speculative Trace Scheduling in VLIW Processors
Proceedings of the 15th symposium on Integrated circuits and systems design
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
High-level synthesis challenges and solutions for a dynamically reconfigurable processor
Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design
IEICE - Transactions on Information and Systems
IEICE - Transactions on Information and Systems
Compiling Techniques for Coarse Grained Runtime Reconfigurable Architectures
ARC '09 Proceedings of the 5th International Workshop on Reconfigurable Computing: Architectures, Tools and Applications
REDEFINE: Runtime reconfigurable polymorphic ASIC
ACM Transactions on Embedded Computing Systems (TECS)
Streaming FFT on REDEFINE-v2: an application-architecture design space exploration
CASES '09 Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems
Hi-index | 0.00 |
In Dynamically Reconfigurable Processors (DRPs), compilation involves breaking an application into sub-tasks for piecewise execution on the fabric. These sub-tasks are sequenced based on data and control dependences. In DRPs, sub-task prefetching is used to hide the reconfiguration time while another sub-task executes. In REDEFINE, our target DRP, subtasks are referred to as HyperOps. Determining the successor for a HyperOp requires merging information from the control flow graph and the HyperOp dataflow graph. Succession in many cases is data dependent. Since hardware branch predictors cannot be applied due to the non-binary branches, we employ a speculative prefetch unit together with a profile based prediction scheme. Simulation results show around 7-33% reduction in overall execution time, when compared to the execution time without prefetching. We observe better performance when fewer resources on the fabric are used to execute prefetched HyperOps.