Mapping deep nested do-loop DSP algorithms to large scale FPGA array structures

Authors:
Surin Kittitornkun;Yu Hen Hu
Affiliations:
Department of Electrical and Computer Engineering, University of Wisconsin, Madison, WI;Department of Electrical and Computer Engineering, University of Wisconsin, Madison, WI
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2003

Abstract

Recently, field-programmable gate array (FPGA) technology has made significant advances in both speed and capacity. Millions of logic gates are now available for reconfigurable programming. To fully exploit the potential of so many programmable devices, a powerful design methodology must be developed. In this paper, we propose a novel, systematic computer-aided design methodology that can efficiently implement deeply nested do-loop algorithms on an FPGA. Specifically, our design methodology maps the loop dependence graph onto a linear array of locally connected processing elements to exploit parallelism. Because of its regular structure, this linear array of processors can be easily implemented on an FPGA. While this method is based on conventional systolic array design methodology, our proposed approach exhibits two distinct features that contribute to its superior performance: 1) we developed a novel multiple-order dependence graph representation that can efficiently represent distinct, yet correct, algorithm execution orders; 2) we introduced new FPGA-specific architectural constraints into the mapping process. As such, FPGA implementations based on our approach use far fewer lookup tables while achieving superior performance.
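
To make the idea of mapping a loop dependence graph onto a linear array more concrete, the following is a minimal sketch of the conventional systolic space-time mapping that the abstract says the method builds on. It is not the multiple-order dependence graph approach proposed in the paper. The FIR filter example, the function names (`space_time_map`, `is_causal`, `fir_on_array`), and the chosen allocation and schedule vectors are all illustrative assumptions.

```python
from itertools import product


def space_time_map(bounds, alloc, sched):
    """Assign every index point of a nested loop to a (PE, time step) pair.

    bounds: loop upper bounds, one per nesting level (hypothetical example input)
    alloc:  allocation vector; PE index = alloc . point
    sched:  schedule vector;   time step = sched . point
    """
    placement = {}
    for point in product(*(range(b) for b in bounds)):
        pe = sum(a * i for a, i in zip(alloc, point))
        t = sum(s * i for s, i in zip(sched, point))
        if (pe, t) in placement:
            # Two iterations would need the same PE in the same cycle.
            raise ValueError(f"conflict at PE {pe}, time {t}")
        placement[(pe, t)] = point
    return placement


def is_causal(deps, sched):
    """A linear schedule is valid only if every dependence advances time."""
    return all(sum(s * d for s, d in zip(sched, dep)) > 0 for dep in deps)


def fir_on_array(x, h):
    """Evaluate y[i] = sum_j h[j] * x[i - j] in the order the array would.

    Assumed mapping: PE index = j (one PE per filter tap), time = i + j.
    """
    n, k = len(x), len(h)
    placement = space_time_map((n, k), alloc=(0, 1), sched=(1, 1))
    y = [0] * n
    # Visit iterations in schedule order (by time step).
    for pe, t in sorted(placement, key=lambda key: key[1]):
        i, j = placement[(pe, t)]
        if i - j >= 0:
            y[i] += h[j] * x[i - j]
    return y


# Localized dependences of the two-level FIR loop nest over (i, j):
# partial sum of y moves along j, h along i, x along the diagonal.
FIR_DEPS = [(0, 1), (1, 0), (1, 1)]
```

In this sketch the number of processing elements equals the number of filter taps, and the schedule vector `(1, 1)` satisfies every dependence in `FIR_DEPS`. A different allocation or schedule would trade array length against latency, which is the kind of design choice a mapping methodology has to explore.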