Many computations can be modeled as systems of affine recurrence equations (SAREs) over polyhedral domains. We study the problem of scheduling the individual computations of an SARE in the presence of reductions, i.e., operations that accumulate a set of values into a single value. A reduction uses a commutative and associative operator and therefore does not, per se, impose any specific order of accumulation. On realistic machines, however, operators have bounded fan-in, so an order of accumulation (a serialization) must be chosen. An arbitrary serialization may adversely affect the running time of a program. We develop an algorithm that determines efficient serializations of all reductions, and we illustrate our methods with two significant examples.
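To make the effect of serialization concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a binary (fan-in 2) operator with unit latency and unlimited parallel operators, and compares two serializations of the same reduction by the length of their longest dependence chain. The function names `chain_reduce` and `tree_reduce` are hypothetical and introduced only for illustration.

```python
from operator import add
from typing import Callable, List, Tuple


def chain_reduce(values: List[int], op: Callable[[int, int], int]) -> Tuple[int, int]:
    """Left-to-right serialization: each step depends on the previous one.

    Returns (result, depth), where depth is the length of the longest
    dependence chain, assuming one time unit per binary operation.
    """
    acc = values[0]
    depth = 0
    for v in values[1:]:
        acc = op(acc, v)
        depth += 1  # every application waits for the previous accumulator
    return acc, depth


def tree_reduce(values: List[int], op: Callable[[int, int], int]) -> Tuple[int, int]:
    """Balanced-tree serialization: independent pairs are combined per level.

    Relies on associativity of op; depth counts the number of levels.
    """
    level = list(values)
    depth = 0
    while len(level) > 1:
        # Combine adjacent pairs; all pairs in a level are independent.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # odd element is carried to the next level
        level = nxt
        depth += 1
    return level[0], depth


data = list(range(1, 9))          # eight values to accumulate
print(chain_reduce(data, add))    # same result, dependence chain of length 7
print(tree_reduce(data, add))     # same result, dependence chain of length 3
```

Both serializations compute the same value because the operator is associative, but the chain needs n - 1 sequential steps while the tree needs about log2(n). In an SARE the better choice also depends on when each operand becomes available under the schedule, which this sketch ignores.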