Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks

Authors:
R. Govindarajan;Guang R. Gao;Palash Desai
Affiliations:
Supercomputer Education & Research Centre, Computer Science & Automation, Indian Institute of Science, Bangalore, 560 012, India; Electrical & Computer Engineering, University of Delaware, Newark, DE 19716, USA; Conductus, Inc., 969 West Maude Ave., Sunnyvale, CA 94085, USA
Venue:
Journal of VLSI Signal Processing Systems
Year:
2002



Abstract

Large-grain synchronous dataflow graphs or multi-rate graphs have the distinct feature that the nodes of the dataflow graph fire at different rates. Such multi-rate large-grain dataflow graphs have been widely regarded as a powerful programming model for DSP applications. In this paper we propose a method to minimize buffer storage requirement in constructing rate-optimal compile-time (MBRO) schedules for multi-rate dataflow graphs. We demonstrate that the constraints to minimize buffer storage while executing at the optimal computation rate (i.e., the maximum possible computation rate without storage constraints) can be formulated as a unified linear programming problem in our framework. A novel feature of our method is that in constructing the rate-optimal schedule, it directly minimizes the memory requirement by choosing the schedule time of nodes appropriately. Lastly, a new circular-arc interval graph coloring algorithm has been proposed to further reduce the memory requirement by allowing buffer sharing among the arcs of the multi-rate dataflow graph.

We have constructed an experimental testbed which implements our MBRO scheduling algorithm as well as (i) the widely used periodic admissible parallel schedules (also known as block schedules) proposed by Lee and Messerschmitt (IEEE Transactions on Computers, vol. 36, no. 1, 1987, pp. 24–35), (ii) the optimal scheduling buffer allocation (OSBA) algorithm of Ning and Gao (Conference Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, SC, Jan. 10–13, 1993, pp. 29–42), and (iii) the multi-rate software pipelining (MRSP) algorithm (Govindarajan and Gao, in Proceedings of the 1993 International Conference on Application Specific Array Processors, Venice, Italy, Oct. 25–27, 1993, pp. 77–88). Schedules generated for a number of random dataflow graphs and for a set of DSP application programs using the different scheduling methods are compared.
The experimental results have demonstrated a significant improvement (10–20%) in buffer requirements for the MBRO schedules compared to the schedules generated by the other three methods, without sacrificing the computation rate. The MBRO method also gives a 20% average improvement in computation rate compared to the block scheduling method of Lee and Messerschmitt.
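
As a small illustration of what "nodes fire at different rates" means in a multi-rate graph, the sketch below computes the repetition vector of a synchronous dataflow graph, that is, the smallest number of firings of each node per iteration that leaves every buffer back at its starting level. This is the standard balance-equation calculation for such graphs and is not the MBRO formulation itself. The function names `repetition_vector` and `tokens_per_iteration`, the arc tuple layout `(src, dst, produced, consumed)`, and the three-node example graph are assumptions made for this sketch only.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(nodes, arcs):
    """Return the minimal integer firing count per node for one iteration.

    arcs is a list of (src, dst, produced, consumed) tuples (hypothetical
    layout): src writes `produced` tokens per firing, dst reads `consumed`.
    The balance equation for each arc is q[src] * produced == q[dst] * consumed.
    """
    # Undirected adjacency carrying the firing-rate ratio across each arc.
    adj = {n: [] for n in nodes}
    for src, dst, produced, consumed in arcs:
        adj[src].append((dst, Fraction(produced, consumed)))
        adj[dst].append((src, Fraction(consumed, produced)))

    # Propagate relative rates from an arbitrary start node.
    rate = {nodes[0]: Fraction(1)}
    stack = [nodes[0]]
    while stack:
        u = stack.pop()
        for v, ratio in adj[u]:
            expected = rate[u] * ratio
            if v not in rate:
                rate[v] = expected
                stack.append(v)
            elif rate[v] != expected:
                raise ValueError("inconsistent rates: no periodic schedule exists")
    if len(rate) != len(nodes):
        raise ValueError("graph is not connected")

    # Scale the rational rates to the smallest all-integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (f.denominator for f in rate.values()), 1)
    ints = {n: int(f * scale) for n, f in rate.items()}
    common = reduce(gcd, ints.values())
    return {n: k // common for n, k in ints.items()}


def tokens_per_iteration(arcs, q):
    """Tokens that cross each arc during one full iteration."""
    return {(src, dst): q[src] * produced for src, dst, produced, _ in arcs}


if __name__ == "__main__":
    # Hypothetical graph: A produces 2, B consumes 3; B produces 1, C consumes 2.
    nodes = ["A", "B", "C"]
    arcs = [("A", "B", 2, 3), ("B", "C", 1, 2)]
    q = repetition_vector(nodes, arcs)
    print(q)
    print(tokens_per_iteration(arcs, q))
```

In this example A must fire three times for every two firings of B and one firing of C. The per-arc token counts only describe the traffic in one iteration; how much storage each arc actually needs depends on when the firings are scheduled, which is the quantity the abstract's method sets out to minimize.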