A memory interface for multi-purpose multi-stream accelerators

  • Authors:
  • Sylvain Girbal;Olivier Temam;Sami Yehia;Hugues Berry;Zheng LI

  • Affiliations:
  • Thales Research and Technology, Palaiseau, France;INRIA Saclay, Orsay, France;Thales Research and Technology, Palaiseau, France;INRIA Saclay, Orsay, France;INRIA Saclay, Orsay, France

  • Venue:
  • CASES '10 Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Power and programming challenges make heterogeneous multi-cores composed of cores and ASICs an attractive alternative to homogeneous multi-cores. Recently, multi-purpose loop-based generated accelerators have emerged as an especially attractive accelerator option. They have several assets: short design time (automatic generation), flexibility (multi-purpose) but low configuration and routing overhead (unlike FPGAs), computational performance (operations are directly mapped to hardware), and a focus on memory throughput by leveraging loop constructs. However, with multiple streams, the memory behavior of such accelerators can become at least as complex as that of superscalar processors, while they still need to retain the memory ordering predictability and throughput efficiency of DMAs. In this article, we show how to design a memory interface for multi-purpose accelerators which combines the ordering predictability of DMAs, retains key efficiency features of memory systems for complex processors, and requires only a fraction of their cost by leveraging the properties of streams references. We evaluate the approach with a synthesizable version of the memory interface for an example 9-task generated loop-based accelerator