Block-processing can decrease the time and power required to perform a given computation by processing multiple samples of input data simultaneously. Its effectiveness can be severely limited, however, if the delays in the dataflow graph of the computation are placed suboptimally. In this paper, we investigate the application of retiming for improving the effectiveness of block-processing in computations. In particular, we consider the k-delay problem: given a computation dataflow graph and a positive integer k, we wish to compute a retimed computation graph in which the original delays have been relocated so that k data samples can be processed simultaneously and fully regularly. We give an exact integer linear programming formulation for the k-delay problem. We also describe an algorithm that solves the k-delay problem quickly in practice by relying on a set of necessary conditions to prune the search space. Experimental results with synthetic and random benchmarks demonstrate the performance improvements achievable by block-processing and the efficiency of our algorithm.
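To make the retiming step concrete, the following minimal Python sketch relocates delays using the standard retiming rule w_r(u, v) = w(u, v) + r(v) - r(u), where r assigns an integer lag to each node. The graph, the lag values, and the function names `retime` and `is_k_delay` are hypothetical and chosen only for illustration. The abstract does not spell out the exact condition for fully regular block-processing, so the check below assumes, purely as an illustration, that every edge must carry either 0 or exactly k delays; it is not the integer linear programming formulation or the pruning algorithm described above.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (source node, destination node)


def retime(delays: Dict[Edge, int], lag: Dict[str, int]) -> Dict[Edge, int]:
    """Apply the standard retiming rule w_r(u, v) = w(u, v) + r(v) - r(u).

    Nodes missing from `lag` are treated as having lag 0. A retiming that
    would leave an edge with a negative delay count is rejected.
    """
    retimed: Dict[Edge, int] = {}
    for (u, v), w in delays.items():
        w_r = w + lag.get(v, 0) - lag.get(u, 0)
        if w_r < 0:
            raise ValueError(f"edge {u}->{v} would carry {w_r} delays")
        retimed[(u, v)] = w_r
    return retimed


def is_k_delay(delays: Dict[Edge, int], k: int) -> bool:
    """ASSUMPTION (not taken from the abstract): a graph is acceptable
    when every edge carries either 0 or exactly k delays."""
    return all(w in (0, k) for w in delays.values())


# Hypothetical three-node cycle a -> b -> c -> a with two delays in total.
graph = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 0}

# Lag of -1 on node b moves the delay on a->b across b onto b->c.
lag = {"b": -1}
retimed = retime(graph, lag)

print(retimed)
print(is_k_delay(graph, 2), is_k_delay(retimed, 2))
```

In this toy example the two delays start on separate edges and end up grouped on the single edge b->c, while the total number of delays around the cycle stays the same, as it must under any legal retiming.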