Throughput-memory footprint trade-off in synthesis of streaming software on embedded multiprocessors

Authors:
Matin Hashemi; Mohammad H. Foroozannejad; Soheil Ghiasi
Affiliations:
Sharif University of Technology; University of California, Davis; University of California, Davis
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Year:
2013

Abstract

We study the trade-off between the throughput and the memory footprint of embedded software synthesized from acyclic static dataflow (task graph) specifications for distributed-memory multiprocessors. We identify iteration overlapping as a knob in the synthesis process that trades application throughput against memory requirement. Given an initial processor assignment and a non-overlapped task schedule, we formally present the underlying properties of the problem, such as the constraints on a valid iteration overlapping, the maximum possible throughput, and the minimum memory footprint. Moreover, we develop an effective algorithm that generates a rich set of design points offering a range of trade-off options. Experimental results on a number of applications and architectures validate the effectiveness of our approach.
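
To make the throughput-memory knob concrete, the following is a minimal sketch under a deliberately simplified toy model; it is not the algorithm developed in the paper. It assumes a linear chain of tasks, an idealized bound on the iteration period of max(bottleneck processor load, latency / k) when k iterations are in flight, and one private copy of every inter-task buffer per in-flight iteration. The names `Task`, `design_points`, and `max_overlap` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    proc: int       # processor the task is assigned to (assumed given)
    exec_time: int  # time units per iteration
    out_bytes: int  # bytes handed to the next task per iteration


def design_points(chain, max_overlap):
    """Return (overlap, throughput, memory) tuples for a toy model.

    Assumptions (illustrative only, not taken from the paper):
    - tasks form a linear chain, so one iteration's latency is the
      sum of all execution times;
    - with k iterations in flight, the period cannot drop below
      latency / k nor below the busiest processor's load;
    - every in-flight iteration keeps its own copy of each buffer.
    """
    latency = sum(t.exec_time for t in chain)

    # Per-processor load for one iteration.
    load = {}
    for t in chain:
        load[t.proc] = load.get(t.proc, 0) + t.exec_time
    bottleneck = max(load.values())

    # Buffers sit on the edges between consecutive tasks.
    buffer_bytes = sum(t.out_bytes for t in chain[:-1])

    points = []
    for k in range(1, max_overlap + 1):
        period = max(bottleneck, latency / k)
        points.append((k, 1.0 / period, k * buffer_bytes))
    return points


chain = [
    Task("read", proc=0, exec_time=4, out_bytes=64),
    Task("filter", proc=1, exec_time=6, out_bytes=32),
    Task("write", proc=2, exec_time=2, out_bytes=0),
]

for k, throughput, memory in design_points(chain, max_overlap=3):
    print(f"overlap={k} throughput={throughput:.3f} memory={memory}")
```

In this toy example the throughput stops improving once the bottleneck processor is saturated, while the memory estimate keeps growing with the overlap, which is the kind of trade-off the abstract describes.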