Using a directed acyclic graph (dag) model of algorithms, we investigate precedence-constrained multiprocessor schedules for the $n \times n \times n$ directed mesh. This cubical mesh is fundamental, representing the standard algorithm for square matrix product, as well as many other algorithms. Its completion requires at least $3n - 2$ multiprocessor steps. Time-minimal multiprocessor schedules that use as few processors as possible are called processor-time-minimal. For the cubical mesh, such a schedule requires at least $\lceil 3n^2/4 \rceil$ processors. Among such schedules, one with the minimum period (i.e., maximum throughput) is referred to as a period-processor-time-minimal schedule. The period of any processor-time-minimal schedule for the cubical mesh is at least $3n/2$ steps. This lower bound is shown to be exact by constructing, for $n$ a multiple of 6, a period-processor-time-minimal multiprocessor schedule that can be realized on a systolic array whose topology is a toroidally connected $n/2 \times n/2 \times 3$ mesh.
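The step count and processor bound above can be checked by brute force for small n. The sketch below is an illustration, not the paper's proof. It assumes mesh nodes are indexed (i, j, k) with 0 <= i, j, k < n and arcs that increase exactly one coordinate by 1; the function names are hypothetical. Under that assumption every node lies on a longest path, so a time-minimal schedule has to run node (i, j, k) at step i + j + k + 1, and the busiest step gives the number of processors needed.

```python
from collections import Counter
from itertools import product


def schedule_profile(n):
    """Map each step to the number of mesh nodes that must run at that step.

    Assumes arcs (i,j,k) -> (i+1,j,k), (i,j+1,k), (i,j,k+1). Every node then
    lies on a longest path from (0,0,0) to (n-1,n-1,n-1), which forces node
    (i,j,k) to execute at step i + j + k + 1 in any time-minimal schedule.
    """
    return Counter(i + j + k + 1 for i, j, k in product(range(n), repeat=3))


def steps_and_peak(n):
    """Return (number of steps, largest number of nodes in any single step)."""
    profile = schedule_profile(n)
    return max(profile), max(profile.values())


def ceil_div(a, b):
    """Integer ceiling of a / b, avoiding floating-point rounding."""
    return -(-a // b)


if __name__ == "__main__":
    for n in range(1, 9):
        steps, peak = steps_and_peak(n)
        # Compare against the closed forms 3n - 2 and ceil(3n^2 / 4).
        print(n, steps, 3 * n - 2, peak, ceil_div(3 * n * n, 4))
```

The check is exhaustive only for the values of n it runs, so it supports the closed forms rather than proving them. It says nothing about the period bound or the toroidal array construction, which need the schedule itself.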