This paper presents pipeline vectorization, a method for synthesizing hardware pipelines based on software vectorizing compilers. The method improves the efficiency and ease of development of hardware designs, particularly for users with little electronics design experience. We propose several loop transformations to customize pipelines to meet hardware resource constraints while maximizing available parallelism. For runtime reconfigurable systems, we apply hardware specialization to increase circuit utilization. Our approach is especially effective for highly repetitive computations in digital signal processing (DSP) and multimedia applications. Case studies using field-programmable gate array (FPGA)-based platforms are presented to demonstrate the benefits of our approach and to evaluate tradeoffs between alternative implementations. For instance, the loop-tiling transformation has been found to improve vectorization performance by a factor of 30 to 40 over a PC-based software implementation, depending on whether runtime reconfiguration (RTR) is used.
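To make the idea of loop tiling concrete, the following is a minimal software sketch, not the implementation described in the paper. It restructures a single loop into an outer loop over fixed-size tiles and an inner loop over the elements of one tile (in one dimension this is often called strip-mining). The function names, the gain-and-bias computation, and the `tile_size` parameter are all illustrative assumptions; `tile_size` stands in for a hypothetical hardware resource limit such as an on-chip buffer capacity.

```python
def scale_offset(samples, gain, bias):
    """Reference version: a single loop over the whole input."""
    return [gain * s + bias for s in samples]


def scale_offset_tiled(samples, gain, bias, tile_size):
    """Tiled version of the same computation.

    tile_size is a hypothetical resource limit (for example, how many
    samples fit in an assumed on-chip buffer). It is an illustrative
    parameter, not a value taken from the paper.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    out = []
    # Outer loop: step through the input one tile at a time.
    for start in range(0, len(samples), tile_size):
        tile = samples[start:start + tile_size]  # at most tile_size elements
        # Inner loop: fixed, small trip count; in a hardware setting this
        # is the kind of loop that would be a candidate for pipelining.
        for s in tile:
            out.append(gain * s + bias)
    return out


if __name__ == "__main__":
    data = list(range(10))
    print(scale_offset_tiled(data, gain=2, bias=1, tile_size=4))
```

The tiled version computes the same result as the reference loop for any positive `tile_size`; only the iteration structure changes, which is what lets the inner loop be sized to fit a given resource budget.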