Using a directed acyclic graph (DAG) model of algorithms, the paper focuses on time-minimal multiprocessor schedules that use as few processors as possible. Such processor-time-minimal scheduling of an algorithm's DAG is first illustrated using a triangular 2-D directed mesh (representing, for example, an algorithm for solving a triangular system of linear equations). Then, algorithms represented by an $n \times n \times n$ directed mesh are investigated. This cubical directed mesh is fundamental: it represents the standard algorithm for computing the matrix product, as well as many other algorithms. Completing the cubical mesh requires $3n-2$ steps. It is shown that the number of processing elements needed to achieve this time bound is at least $\lceil 3n^2/4 \rceil$. A systolic array for the cubical directed mesh is then presented. It completes the mesh using the minimum number of steps and exactly $\lceil 3n^2/4 \rceil$ processing elements; it is therefore processor-time-minimal. The systolic array's topology is that of a hexagonally shaped, cylindrically connected, 2-D directed mesh.
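The two figures quoted above can be checked numerically for small $n$. The following is a minimal sketch, not the paper's construction: it assumes the cubical mesh has nodes $(i, j, k)$ with $0 \le i, j, k < n$ and arcs that increase exactly one coordinate by one, so that in a time-minimal schedule node $(i, j, k)$ must run at step $i + j + k + 1$. Under that assumption the number of steps is the number of distinct levels, and the widest level is a lower bound on the number of processing elements. The function name `cubical_mesh_profile` is hypothetical.

```python
from collections import Counter
from itertools import product


def cubical_mesh_profile(n):
    """Return (steps, widest_level) for the n x n x n directed mesh.

    Assumption (not taken from the paper's text): node (i, j, k) has arcs to
    (i+1, j, k), (i, j+1, k) and (i, j, k+1). Every node then lies on a
    longest path, so a time-minimal schedule runs it at step i + j + k + 1.
    """
    # Count how many nodes are forced into each step.
    level_sizes = Counter(
        i + j + k + 1 for i, j, k in product(range(n), repeat=3)
    )
    steps = max(level_sizes)            # index of the last step
    widest = max(level_sizes.values())  # most nodes sharing one step
    return steps, widest


if __name__ == "__main__":
    for n in range(1, 9):
        steps, widest = cubical_mesh_profile(n)
        # Integer form of ceil(3 * n^2 / 4).
        bound = (3 * n * n + 3) // 4
        assert steps == 3 * n - 2
        assert widest == bound
        print(f"n={n}: steps={steps}, widest level={widest}")
```

The widest level only shows that fewer processing elements cannot suffice; that this many are also enough is the contribution of the systolic array described in the abstract, which this sketch does not reproduce.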