Fast shared on-chip memory architecture for efficient hybrid computing with CGRAs

  • Authors:
  • Jongeun Lee;Yeonghun Jeong;Sungsok Seo

  • Affiliations:
  • Ulsan National Institute of Science and Technology, Ulsan, Korea;Ulsan National Institute of Science and Technology, Ulsan, Korea;Ulsan National Institute of Science and Technology, Ulsan, Korea

  • Venue:
  • Proceedings of the Conference on Design, Automation and Test in Europe
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

While Coarse-Grained Reconfigurable Architectures (CGRAs) are very efficient at handling regular, compute-intensive loops, their weakness at control-intensive processing and the need for frequent reconfiguration require another processor, for which usually a main processor is used. To minimize the overhead arising in such collaborative execution, we integrate a dedicated sequential processor (SP) with a reconfigurable array (RA), where the crucial problem is how to share the memory between SP and RA while keeping the SP's memory access latency very short. We present a detailed architecture, control, and program example of our approach, focusing on our optimized on-chip shared memory organization between SP and RA. Our preliminary results demonstrate that our optimized memory architecture is very effective in reducing kernel execution times (23.5% compared to a more straightforward alternative), and our approach can reduce the RA control overhead and other sequential code execution time in kernels significantly, resulting in up to 23.1% reduction in kernel execution time, compared to the conventional system using the main processor for sequential code execution.