DATE '03 Proceedings of the conference on Design, Automation and Test in Europe - Volume 1
The MOLEN Polymorphic Processor
IEEE Transactions on Computers
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 3 - Volume 04
QUKU: A Two-Level Reconfigurable Architecture
ISVLSI '06 Proceedings of the IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures
Designing Embedded Processors: A Low Power Perspective
Designing Embedded Processors: A Low Power Perspective
A direct method for optimal VLSI realization of deeply nested n-D loop problems
Microprocessors & Microsystems
Integration, the VLSI Journal
Hi-index | 0.00 |
Signal processors exploiting ASIC acceleration suffer from sky-rocketing manufacturing costs and long design cycles. FPGA-based systems provide a programmable alternative for exploiting computation parallelism, but the flexibility they provide is not as high as in processor-oriented architectures: HDL or C-to-HDL flows still require specific expertise and a hardware knowledge background. On the other hand, the large size of the configuration bitstream and the inherent complexity of FPGA devices make their dynamic reconfiguration not a very viable approach. Coarse-grained reconfigurable architectures (CGRAs) are an appealing solution but they pose implementation problems and tend to be application specific. This paper presents a scalable CGRA which eases the implementation of algorithms on field programmable gate array (FPGA) platforms. This design option is based on two levels of programmability: it takes advantage of performance and reliability provided by state-of-the-art FPGA technology, and at the same time it provides the user with flexibility, performance and ease of reconfiguration typical of standard CGRAs. The basic cell template provides advanced features such as sub-word SIMD integer and floating-point computation capabilities, as well as saturating arithmetic. Multiple reconfiguration contexts and partial run-time reconfiguration capabilities are provided, tackling this way the problem of high reconfiguration overhead typical of FPGAs. Selected instances of the proposed architecture have been implemented on an Altera Stratix II EP2S180 FPGA. On this system, we mapped some common DSP, image processing, 3D graphics and audio compression algorithms in order to validate our approach and to demonstrate its effectiveness by benchmarking the benefits achieved.