Heterogeneous coarse-grained processing elements: a template architecture for embedded processing acceleration

  • Authors: Giovanni Ansaloni; Paolo Bonzini; Laura Pozzi

  • Affiliations: University of Lugano (USI), Switzerland (all authors)

  • Venue: Proceedings of the Conference on Design, Automation and Test in Europe
  • Year: 2009

Abstract

Reconfigurable architectures are good candidates for application accelerators that cannot be fixed at production time. FPGAs, however, often suffer from the area and performance penalties intrinsic to gate-level reconfigurability. To reduce this overhead, coarse-grained reconfigurable arrays (CGRAs) are reconfigurable at the ALU level. A successful design needs more than computational power, though, because the main bottleneck is usually memory transfers. Just as the integration of hardwired multipliers and memory blocks enabled FPGAs to implement digital signal processing applications efficiently, the customizable architecture template studied in this paper is based on heterogeneous processing elements (multipliers, ALU clusters and memories) and provides enough flexibility to realize fast pipelined implementations of various loop kernels on a CGRA.