A coarse-grain reconfigurable architecture for multimedia applications supporting subword and floating-point calculations

  • Authors:
  • Claudio Brunelli;Fabio Garzia;Davide Rossi;Jari Nurmi

  • Affiliations:
  • Tampere University of Technology, Department of Computer Systems, P.O. Box 553, FIN-33101 Tampere, Finland and Nokia Research Center, Itämerenkatu 11 - 13, 00180 Helsinki, Finland;Tampere University of Technology, Department of Computer Systems, P.O. Box 553, FIN-33101 Tampere, Finland;ARCES Laboratories, University of Bologna, Viale Pepoli 3/2, 40136, Italy;Tampere University of Technology, Department of Computer Systems, P.O. Box 553, FIN-33101 Tampere, Finland

  • Venue:
  • Journal of Systems Architecture: the EUROMICRO Journal
  • Year:
  • 2010

Abstract

Signal processors that rely on ASIC acceleration suffer from soaring manufacturing costs and long design cycles. Systems based on field-programmable gate arrays (FPGAs) provide a programmable alternative for exploiting computational parallelism, but they are less flexible than processor-oriented architectures: HDL and C-to-HDL flows still require specific expertise and a hardware background. Moreover, the large size of the configuration bitstream and the inherent complexity of FPGA devices make their dynamic reconfiguration impractical. Coarse-grained reconfigurable architectures (CGRAs) are an appealing solution, but they pose implementation problems and tend to be application-specific. This paper presents a scalable CGRA that eases the implementation of algorithms on FPGA platforms. The design is based on two levels of programmability: it takes advantage of the performance and reliability of state-of-the-art FPGA technology, and at the same time it gives the user the flexibility, performance and ease of reconfiguration typical of standard CGRAs. The basic cell template provides advanced features such as sub-word SIMD integer and floating-point computation, as well as saturating arithmetic. Multiple reconfiguration contexts and partial run-time reconfiguration are supported, thereby addressing the high reconfiguration overhead typical of FPGAs. Selected instances of the proposed architecture have been implemented on an Altera Stratix II EP2S180 FPGA. On this system, we mapped several common DSP, image processing, 3D graphics and audio compression algorithms to validate the approach and to quantify its benefits through benchmarking.
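
As a rough illustration of what sub-word SIMD with saturating arithmetic means in general, the sketch below adds four unsigned 8-bit lanes packed into a 32-bit word and clamps each lane at 255 instead of letting it wrap around. The function name, the lane width and the 32-bit word size are assumptions chosen for the example; the abstract does not specify the cell's actual data path, so this is not a model of the proposed architecture.

```python
def saturating_add_u8x4(a: int, b: int) -> int:
    """Add four unsigned 8-bit lanes packed in 32-bit words.

    Hypothetical helper for illustration only: each lane is added
    independently and clamped at 0xFF rather than wrapping.
    """
    result = 0
    for lane in range(4):
        shift = 8 * lane
        x = (a >> shift) & 0xFF       # extract lane from first operand
        y = (b >> shift) & 0xFF       # extract lane from second operand
        s = min(x + y, 0xFF)          # saturate instead of overflowing
        result |= s << shift          # write the lane back in place
    return result


# Lanes (high to low): 0x10+0x20, 0xF0+0x20, 0xFF+0x01, 0x01+0x02
print(hex(saturating_add_u8x4(0x10F0FF01, 0x20200102)))  # 0x30ffff03
```

In pixel or audio processing, clamping of this kind avoids the visible or audible artifacts that wrap-around overflow would otherwise cause, which is one reason saturating modes are common in multimedia data paths.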