Stream Scheduling: A Framework to Manage Bulk Operations in Memory Hierarchies

Authors:
Abhishek Das;William J. Dally
Affiliations:
Computer Systems Laboratory, Stanford University,;Computer Systems Laboratory, Stanford University,
Venue:
Euro-Par '08 Proceedings of the 14th international Euro-Par conference on Parallel Processing
Year:
2008

Citing 8
Cited 0

Software pipelining: an effective scheduling technique for VLIW machines

PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
More iteration space tiling

Proceedings of the 1989 ACM/IEEE conference on Supercomputing
Imagine: Media Processing with Streams

IEEE Micro
Compiling for stream processing

Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Exploiting coarse-grained task, data, and pipeline parallelism in stream programs

Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Sequoia: programming the memory hierarchy

Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Sequoia: programming the memory hierarchy

Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Compilation for explicitly managed memory hierarchies

Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming

Quantified Score

Hi-index	0.00

Visualization

Abstract

With the emergence of streaming and multi-core architectures, there is an increasing demand to map parallel algorithms efficiently across all architectures. This paper describes a platform-independent optimization framework called Stream Scheduling, that orchestrates parallel execution of bulk computations and data transfers, and allocates storage at multiple levels of a memory hierarchy. By adjusting block sizes, and applying software pipelining on bulk operations, it ensures computation-to-communication ratio is maximized on each level. We evaluate our framework on a diverse set of Sequoia applications, targeting systems with different memory hierarchies: a Cell blade, a distributed-memory cluster, and the Cell blade attached to a disk.