We present a novel 2-approximation algorithm for deploying stream graphs on multicore computers and a stream graph transformation that eliminates bottlenecks. The key technical insight is a data rate transfer model that enables the computation of a "closed form", i.e., the data rate transfer function of an actor as a function of the arrival rate of the stream program. A combinatorial optimization problem uses the closed form to maximize the throughput of the stream program. Although the problem is inherently NP-hard, we present an efficient and effective 2-approximation algorithm that provides a lower bound on the quality of the solution. We also introduce a transformation that uses the closed form to identify and eliminate bottlenecks. We show experimentally that (1) state-of-the-art integer linear programming approaches for orchestrating stream graphs are intractable, or at least impractical, for larger stream graphs and larger numbers of processors, and (2) our 2-approximation algorithm is highly efficient and its results are close to the optimal solution for a standard set of StreamIt benchmark programs.
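To make the notion of a 2-approximation for actor-to-core mapping concrete, the following is a minimal sketch of classic greedy list scheduling, which is known to stay within a factor of 2 of the optimal maximum core load. It is not the algorithm of the paper: it ignores the data rate transfer model, the closed form, and communication costs. The actor names, the per-actor work values, and the function names `greedy_assign` and `makespan_lower_bound` are all hypothetical and chosen only for illustration.

```python
import heapq


def greedy_assign(work, num_cores):
    """Greedy list scheduling: place each actor on the least-loaded core.

    `work` maps a (hypothetical) actor name to its steady-state work per
    iteration. Communication between actors is ignored in this sketch.
    """
    # Min-heap of (current load, core id); the least-loaded core is on top.
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = {}
    for actor, cost in work.items():
        load, core = heapq.heappop(heap)
        assignment[actor] = core
        heapq.heappush(heap, (load + cost, core))
    # Recover the final load of every core from the heap entries.
    loads = [0.0] * num_cores
    for load, core in heap:
        loads[core] = load
    return assignment, loads


def makespan_lower_bound(work, num_cores):
    """No mapping can beat the largest actor or the average core load."""
    return max(max(work.values()), sum(work.values()) / num_cores)


# Assumed example workload, not taken from the paper or its benchmarks.
work = {"source": 2.0, "fir": 6.0, "fft": 5.0, "sink": 3.0}
assignment, loads = greedy_assign(work, num_cores=2)
bound = makespan_lower_bound(work, num_cores=2)
print(assignment, loads, bound)
```

Under the simplifying assumption that steady-state throughput is inversely proportional to the load of the busiest core, the greedy result here has a maximum load of 9.0 against a lower bound of 8.0, so it is well inside the factor-2 guarantee. An optimal mapping for this workload balances the cores at 8.0 each.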