SoC Memory Hierarchy Derivation from Dataflow Graphs

Authors:
Scott Fischaber; Roger Woods; John McAllister
Affiliations:
Programmable Systems Laboratory, Institute of Electronics, Communication and Information Technology (ECIT), Queen's University Belfast, Queen's Island, UK BT3 9DT (all three authors)
Venue:
Journal of Signal Processing Systems
Year:
2010

Abstract

Hardware synthesis from dataflow graphs of signal processing systems is a growing research area as the focus shifts to high-level design methodologies. For data-intensive systems, dataflow-based synthesis can use memory inefficiently, because synchronous dataflow is restrictive and cannot easily model data reuse. This paper explores how changes to the dataflow graph can be used to drive both the on-chip and off-chip memory organisation, and how the resulting memory architectures can be mapped to a hardware implementation. By exploiting the data reuse inherent in many image processing algorithms and by creating memory hierarchies, off-chip memory bandwidth can be reduced by a factor of a thousand relative to the original dataflow-graph-level specification of a motion estimation algorithm, with a minimal increase in memory size. This analysis is verified using results gathered from an implementation of the motion estimation algorithm on a Xilinx Virtex-4 FPGA, where the delay between the memories and the processing elements drops from 14.2 ns to 1.878 ns through refinement of the memory architecture. Care must be taken when modeling these algorithms, however, as inefficiencies in the models can easily translate into overuse of hardware resources.
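
The bandwidth argument in the abstract can be illustrated with a rough count of off-chip pixel reads for full-search block-matching motion estimation. The sketch below is not taken from the paper: the function names, the block size `n`, the search range `p`, and the three reuse levels are illustrative assumptions, and the model ignores frame borders. It only shows why buffering reused data on chip reduces off-chip traffic by orders of magnitude.

```python
# Illustrative model only: counts off-chip pixel reads per macroblock for
# full-search block matching. All names and parameters are assumptions,
# not values or methods from the paper.

def reads_no_reuse(n: int, p: int) -> int:
    """Every candidate position re-reads both blocks from off-chip memory."""
    candidates = (2 * p + 1) ** 2      # displacements in [-p, p] x [-p, p]
    return candidates * 2 * n * n      # current block + reference block each time


def reads_block_reuse(n: int, p: int) -> int:
    """Current block and its search window are each fetched once per block."""
    window = (n + 2 * p) ** 2          # pixels in the search window
    return n * n + window


def reads_frame_reuse(n: int, p: int) -> int:
    """Idealised upper level: overlapping search windows of neighbouring
    blocks are shared, so each reference pixel is fetched once per frame
    (frame borders ignored). p does not affect the count in this model."""
    return n * n + n * n


if __name__ == "__main__":
    n, p = 16, 16                      # assumed block size and search range
    base = reads_no_reuse(n, p)
    for name, fn in [("no reuse", reads_no_reuse),
                     ("block-level reuse", reads_block_reuse),
                     ("frame-level reuse", reads_frame_reuse)]:
        reads = fn(n, p)
        print(f"{name:>18}: {reads:7d} reads/block, reduction x{base / reads:.1f}")
```

Under these assumed parameters the reduction from the unbuffered case to the fully shared case is on the order of a thousand, which matches the order of magnitude quoted in the abstract; the paper's own figures come from its dataflow-graph analysis and may rest on different parameters.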