Unrolling and retiming of stream applications onto embedded multicore processors

Authors:
Weijia Che;Karam S. Chatha
Affiliations:
Arizona State University, Tempe, AZ;Arizona State University, Tempe, AZ
Venue:
Proceedings of the 49th Annual Design Automation Conference
Year:
2012

Abstract

In recent years, stream applications have become prevalent in many embedded domains. Stream applications distinguish themselves from traditional sequential programs through well-defined independent actors, explicit data communication, and stable code/data access patterns. To achieve high performance and low power, scratchpad memory (SPM) has been introduced in today's embedded multicore processors. Programming SPM-based architectures is both challenging and time-consuming. In this paper, we address the problem of automatically compiling stream applications onto SPM-based embedded multicore processors through unrolling and retiming. In our technique, code overlay and data overlay are implemented to overcome the limited SPM capacity. Smart double buffering and code prefetching are introduced to amortize memory access delays. We evaluated the efficiency of our technique by compiling several stream applications onto the IBM Cell processor and comparing their performance with existing approaches.
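
As a rough illustration of why double buffering can amortize memory access delays, the sketch below compares a simple timing model for single and double buffering. It shows only the generic textbook technique, not the paper's "smart" variant; the function names, the fixed per-block costs, and the assumption that a transfer can fully overlap with computation are all hypothetical simplifications.

```python
def single_buffer_time(n_blocks, t_load, t_compute):
    """Total time when each block is loaded, then processed, with no overlap."""
    return n_blocks * (t_load + t_compute)


def double_buffer_time(n_blocks, t_load, t_compute):
    """Total time when the next block is loaded while the current one is processed.

    Assumes (hypothetically) two buffers in local memory and a transfer
    engine that runs fully in parallel with the core.
    """
    if n_blocks == 0:
        return 0
    # The first load cannot be hidden; after that, each step costs the
    # slower of "load next block" and "compute current block"; the last
    # block's computation has no load left to overlap with.
    return t_load + (n_blocks - 1) * max(t_load, t_compute) + t_compute


if __name__ == "__main__":
    n, t_load, t_compute = 4, 2, 3
    print("single buffering:", single_buffer_time(n, t_load, t_compute))
    print("double buffering:", double_buffer_time(n, t_load, t_compute))
```

Under this model the benefit is largest when load and compute times are similar; when one dominates, the total time approaches that of the dominant phase alone.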