A Processor-Time-Minimal Systolic Array for Transitive Closure

Authors:
C. J. Scheiman; P. R. Cappello
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1992

Abstract

Using a directed acyclic graph (DAG) model of algorithms, the authors focus on processor-time-minimal multiprocessor schedules: time-minimal multiprocessor schedules that use as few processors as possible. The Kung, Lo, and Lewis (KLL) algorithm for computing the transitive closure of a relation over a set of $n$ elements requires at least $5n-4$ parallel steps. As originally reported, their systolic array comprises $n^2$ processing elements. It is shown that any time-minimal multiprocessor schedule of the KLL algorithm's DAG needs at least $n^2/3$ processing elements. Then a processor-time-minimal systolic array realizing the KLL DAG is constructed. Its processing elements are organized as a cylindrically connected 2-D mesh when $n \equiv 0 \pmod 3$. When $n \not\equiv 0 \pmod 3$, the 2-D mesh is connected as a torus.
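
For readers who want a concrete handle on the quantities in the abstract, the following is a minimal sketch, not the authors' construction. It uses the sequential Warshall algorithm as a plain reference for what "transitive closure" computes (the KLL systolic algorithm itself is not reproduced here), and a hypothetical helper `kll_bounds` that simply tabulates the figures quoted above. Rounding the $n^2/3$ lower bound up to an integer is an assumption made for illustration.

```python
def transitive_closure(adj):
    """Sequential Warshall reference: adj[i][j] is True if i relates to j.

    This is NOT the KLL systolic algorithm; it only defines the result
    that such an array is meant to compute.
    """
    n = len(adj)
    reach = [row[:] for row in adj]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def kll_bounds(n):
    """Tabulate the figures quoted in the abstract (hypothetical helper)."""
    return {
        "min_parallel_steps": 5 * n - 4,      # "at least 5n-4 parallel steps"
        "original_pes": n * n,                # array as originally reported
        # Assumption: the n^2/3 lower bound is rounded up to a whole PE count.
        "min_pes_lower_bound": -(-n * n // 3),
        # Cylinder when n is a multiple of 3, torus otherwise.
        "mesh_connection": "cylinder" if n % 3 == 0 else "torus",
    }


if __name__ == "__main__":
    chain = [[False, True, False],
             [False, False, True],
             [False, False, False]]
    print(transitive_closure(chain))
    print(kll_bounds(3))
    print(kll_bounds(4))
```

For example, under these assumptions `kll_bounds(3)` reports 11 parallel steps and a lower bound of 3 processing elements, against 9 in the originally reported array.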