Parallelizing DSP nested loops on reconfigurable architectures using data context switching

Authors: Kiran Bondalapati
Affiliations: Chameleon Systems, Inc., 161 Nortech Parkway, San Jose, CA
Venue: Proceedings of the 38th annual Design Automation Conference
Year: 2001

Abstract

Reconfigurable architectures promise significant performance and flexibility advantages over conventional architectures. Automatic mapping techniques that exploit the features of the hardware are needed to leverage the power of these architectures. In this paper, we develop techniques for parallelizing nested loop computations from digital signal processing (DSP) applications onto high-performance pipelined configurations. We propose a novel data context switching technique that exploits the embedded distributed memory available in reconfigurable architectures to parallelize such loops. Our technique is demonstrated on two diverse state-of-the-art reconfigurable architectures, namely, Virtex and the Chameleon Systems Reconfigurable Communications Processor. Our techniques show significant performance improvements on both architectures and also perform better than state-of-the-art DSP and microprocessor architectures.
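
The abstract names data context switching without spelling out its mechanics, so the following is only an illustrative sketch of one plausible reading, not the paper's implementation. It assumes a nested loop whose inner loop carries a dependency (a first-order recurrence) while the outer iterations are independent. The interleaved version keeps one saved state per outer iteration in a `contexts` list, which stands in here for the embedded distributed memory, and rotates among those states so that consecutive operations do not depend on each other. The function names, the recurrence, and the coefficient `a` are all hypothetical choices made for this example.

```python
def sequential(rows, a):
    """Reference order: finish each outer iteration before starting the next."""
    results = []
    for row in rows:
        acc = 0.0
        for x in row:
            # Each step needs the previous acc, so a deep pipeline would stall.
            acc = a * acc + x
        results.append(acc)
    return results


def interleaved(rows, a):
    """Assumed data-context-switching order: rotate among outer iterations.

    contexts[i] holds the running state of outer iteration i (a stand-in for
    a slot in distributed memory). Rows are assumed to have equal length.
    """
    contexts = [0.0] * len(rows)
    inner_len = len(rows[0]) if rows else 0
    for j in range(inner_len):
        for i, row in enumerate(rows):
            # Back-to-back updates touch different contexts, so they are
            # independent and could be issued on consecutive pipeline cycles.
            contexts[i] = a * contexts[i] + row[j]
    return contexts


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    print(sequential(data, 0.5))
    print(interleaved(data, 0.5))
```

Both functions perform the same arithmetic for each outer iteration in the same order, so they return identical results. Only the schedule differs, which is the property a pipelined configuration would exploit under this assumption.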