Affine control loops (ACLs) comprise an important class of compute- and data-intensive computations. The theoretical framework for the automatic parallelization of ACLs is well established. However, the hardware compilation of arbitrary ACLs is still in its infancy. An important component for an efficient hardware implementation is a control mechanism that informs each processing element (PE) which computation needs to be performed and when. We formulate this control signal problem in the context of compiling arbitrary ACLs parallelized with a multi-dimensional schedule into hardware. We characterize the logical time instants when PEs need a control signal indicating which particular computations need to be performed. Finally, we present an algorithm to compute the minimal set of logical time instants for these control signals.
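To make the control signal problem concrete, the following is a minimal illustrative sketch and not the algorithm presented in the paper. Everything in it is an assumption made for illustration: the three-dimensional loop nest of size `N`, the two-way branch in `kind`, the two-dimensional schedule `(i, k)`, and the allocation of iteration `(i, j, k)` to PE `j`. It enumerates the iterations by brute force, orders the logical time instants of each PE lexicographically, and records an instant whenever the computation that the PE must perform differs from the one at its previous instant.

```python
from itertools import product

N = 3  # hypothetical problem size, chosen only for illustration


def kind(i, j, k):
    """Hypothetical loop body: initialise at k == 0, accumulate otherwise."""
    return "init" if k == 0 else "accumulate"


def schedule(i, j, k):
    """Assumed two-dimensional (multi-dimensional) logical time."""
    return (i, k)


def processor(i, j, k):
    """Assumed allocation: iteration (i, j, k) runs on PE number j."""
    return j


def control_instants(n):
    """For each PE, list the time instants at which its computation changes."""
    per_pe = {}
    for point in product(range(n), repeat=3):
        event = (schedule(*point), kind(*point))
        per_pe.setdefault(processor(*point), []).append(event)

    signals = {}
    for pe, events in per_pe.items():
        events.sort()  # tuples sort lexicographically, matching time order
        needed, previous = [], None
        for time, op in events:
            if op != previous:  # the PE must be told to switch computation
                needed.append((time, op))
                previous = op
        signals[pe] = needed
    return signals


if __name__ == "__main__":
    for pe, needed in control_instants(N).items():
        print(pe, needed)
```

Under these assumptions each PE is active at nine logical time instants but needs a signal at only six of them, since the second consecutive `accumulate` step in each row requires no new information. A brute-force enumeration like this only works for small, fixed sizes; it is meant to show what is being counted, not how to compute it symbolically.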