Buffer-space efficient and deadlock-free scheduling of stream applications on multi-core architectures

Authors: Jongsoo Park; William J. Dally
Affiliations: Stanford University, Stanford, CA, USA (both authors)
Venue: Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Year: 2010

Abstract

We present team scheduling, an algorithm for scheduling stream programs on multi-core architectures. Compared to previous multi-core stream scheduling algorithms, team scheduling achieves 1) similar synchronization overhead, 2) coverage of a larger class of applications, 3) better control over buffer space, 4) deadlock-free feedback loops, and 5) lower latency. We compare team scheduling to the latest stream scheduling algorithm, sgms, by evaluating 14 applications on a multi-core architecture with 16 cores. Team scheduling successfully targets applications that sgms cannot validly schedule because of excessive buffer requirements or deadlocks in feedback loops (e.g., gsm and w-cdma). For applications that sgms can validly schedule, team scheduling achieves on average 37% higher throughput under the same buffer space constraints.