We present team scheduling, a scheduling algorithm for stream programs on multi-core architectures. Compared to previous multi-core stream scheduling algorithms, team scheduling achieves 1) similar synchronization overhead, 2) coverage of a larger class of applications, 3) better control over buffer space, 4) deadlock-free feedback loops, and 5) lower latency. We compare team scheduling to the latest stream scheduling algorithm, sgms, by evaluating 14 applications on a 16-core architecture. Team scheduling successfully targets applications that sgms cannot validly schedule because of excessive buffer requirements or deadlocks in feedback loops (e.g., gsm and w-cdma). For applications that sgms can validly schedule, team scheduling achieves on average 37% higher throughput under the same buffer space constraints.