This paper addresses embedded multiprocessor implementation of iterative, real-time applications, such as digital signal and image processing, that are specified as dataflow graphs. Scheduling dataflow graphs on multiple processors involves assigning tasks to processors (processor assignment), ordering the execution of tasks within each processor (task ordering), and determining when each task must commence execution. We consider three scheduling strategies: fully-static, self-timed and ordered transactions, all of which perform the assignment and ordering steps at compile time. Run-time costs are small for the fully-static strategy; however, it is not robust with respect to changes or uncertainty in task execution times. The self-timed approach is tolerant of variations in task execution times, but pays the penalty of high run-time costs, because processors need to explicitly synchronize whenever they communicate. The ordered transactions approach lies between the fully-static and self-timed strategies; in this approach the order in which processors communicate is determined at compile time and enforced at run time. The ordered transactions strategy retains some of the flexibility of self-timed schedules and at the same time has lower run-time costs than the self-timed approach.

In this paper we determine an order of processor transactions that is nearly optimal given information about task execution times at compile time, and for a given processor assignment and task ordering. The criterion for optimality is the average throughput achieved by the schedule. Our main result is that it is possible to choose a transaction order such that the resulting ordered transactions schedule incurs no performance penalty compared to the more flexible self-timed strategy, even when the higher run-time costs implied by the self-timed strategy are ignored.
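To make the self-timed strategy concrete, the following is a minimal sketch of how task start times could be simulated once the processor assignment and task ordering are fixed. It is an illustration only, not the paper's algorithm: the three-task graph, the execution times, and the names `simulate_self_timed`, `exec_time`, `processor_of`, `order` and `edges` are all hypothetical, and synchronization and bus costs are ignored. Each task starts as soon as its processor is free and its input data have arrived, and the average iteration period (the inverse of throughput) is measured over several iterations.

```python
from collections import defaultdict


def simulate_self_timed(exec_time, processor_of, order, edges, iterations):
    """Return finish[(task, k)] for iteration k of each task.

    exec_time:    task -> execution time (assumed known and constant here)
    processor_of: task -> processor id (the processor assignment)
    order:        tasks of one iteration, listed in an order consistent with
                  every per-processor task ordering and every zero-delay edge
    edges:        (src, dst, delay) triples; iteration k of dst needs
                  iteration k - delay of src to have finished
    """
    preds = defaultdict(list)
    for src, dst, delay in edges:
        preds[dst].append((src, delay))

    proc_free = defaultdict(int)  # time at which each processor becomes idle
    finish = {}
    for k in range(iterations):
        for task in order:
            # A task may start once its processor is free ...
            ready = proc_free[processor_of[task]]
            # ... and once every required input has been produced.
            for src, delay in preds[task]:
                if k - delay >= 0:
                    ready = max(ready, finish[(src, k - delay)])
            finish[(task, k)] = ready + exec_time[task]
            proc_free[processor_of[task]] = finish[(task, k)]
    return finish


# Hypothetical example: A and B run on processor 0, C on processor 1.
exec_time = {"A": 2, "B": 3, "C": 4}
processor_of = {"A": 0, "B": 0, "C": 1}
order = ["A", "C", "B"]
edges = [("A", "C", 0), ("C", "B", 0), ("B", "A", 1)]

n = 5
finish = simulate_self_timed(exec_time, processor_of, order, edges, n)
# Average iteration period, measured on the last task of each iteration.
period = (finish[("B", n - 1)] - finish[("B", 0)]) / (n - 1)
print(period)
```

In this toy graph the three tasks form a single dependence cycle, so the measured period equals the sum of the execution times on that cycle. An ordered transactions schedule would additionally fix, at compile time, the order in which the two processors access the shared communication resource; that step is not modeled here.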