Optimum Broadcasting and Personalized Communication in Hypercubes
IEEE Transactions on Computers
Fortran at ten gigaflops: the connection machine convolution compiler
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Optimal expression evaluation for data parallel architectures
Journal of Parallel and Distributed Computing
Consider an array of processing elements (PEs) connected by a 2-dimensional grid network, with each PE holding at most one operand of an expression. Suppose that, in any one parallel step, each PE may receive one item of data from any of its four immediate neighbors and may also transmit one datum. How can an associative operator, such as addition, combine all the operands using as little communication time as possible? An expression using a single such operator is termed a uniform expression. When the measure of goodness is the total number of communication links used, this becomes a Steiner tree problem in the Manhattan distance metric. When the measure is the parallel time to completion, a method is given that is optimal to within an additive constant of two time steps. The method also applies when the operands are matrices spread over an array of PEs. Some lower bounds for this problem in more general networks are also proven.
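To make the communication model concrete, the following Python sketch computes two elementary lower bounds on the number of parallel steps needed to combine a set of operands on the grid. It is an illustration under stated assumptions, not the method or the bounds from the paper: the function names `manhattan` and `lower_bound_steps` are hypothetical, operands are given as `(row, col)` PE coordinates, and the result is assumed free to end up at any PE.

```python
def manhattan(p, q):
    """Manhattan (grid) distance between two PE coordinates."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def lower_bound_steps(operands):
    """Elementary lower bound on parallel steps to combine all operands.

    Two simple arguments are combined (illustrative only):
      * distance bound: data moves one hop per step, so the PE that ends
        up with the result is at least `radius` steps from the farthest
        operand, where `radius` is minimized over destination PEs;
      * fan-in bound: a PE receives at most one datum per step, so the
        number of operands merged into any value at most doubles per step.
    """
    pts = list(set(operands))
    n = len(pts)
    if n <= 1:
        return 0

    rows = [r for r, _ in pts]
    cols = [c for _, c in pts]

    # A best destination always lies inside the operands' bounding box:
    # moving an outside point onto the box never increases its Manhattan
    # distance to any operand.
    radius = min(
        max(manhattan(p, (r, c)) for p in pts)
        for r in range(min(rows), max(rows) + 1)
        for c in range(min(cols), max(cols) + 1)
    )

    # ceil(log2(n)) computed exactly with integer arithmetic.
    fan_in = (n - 1).bit_length()

    return max(radius, fan_in)


if __name__ == "__main__":
    full_4x4 = [(r, c) for r in range(4) for c in range(4)]
    print(lower_bound_steps(full_4x4))
    print(lower_bound_steps([(0, 0), (0, 5)]))
```

For widely scattered operands the distance bound dominates, while for small, dense clusters the fan-in bound can be the binding one. Neither argument accounts for contention when several partial results converge on one PE, which is where a scheduling method of the kind the abstract describes has to do the real work.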