We consider the problem of matrix multiplication on hexagonal systolic arrays (SAs). We begin by describing a procedure for systolic array design based on data dependencies and space-time mapping of nested-loop algorithms. We then introduce performance measures that are used throughout the chapter to compare various SAs. Next, we modify the standard design procedure so that it synthesizes systolic arrays with the optimal number of processing elements (PEs) for a given problem size, and with minimal execution time for a given number of PEs. We then analyse and compare different hexagonal arrays. Further, we show how the execution time of the matrix multiplication algorithm can be reduced by increasing the number of PEs beyond the optimal one. Finally, we address the problem of fault-tolerant matrix multiplication on hexagonal arrays.
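As a minimal sketch of the space-time mapping idea, the following Python fragment maps each iteration (i, j, k) of the standard triple loop C[i][j] += A[i][k] * B[k][j] to a time step and a 2-D processor coordinate. The concrete choice of scheduling vector s = (1, 1, 1) and projection direction u = (1, 1, 1) is an assumption made here for illustration (it corresponds to the classic hexagonal design); the chapter's modified procedure may choose different vectors.

```python
# Hedged sketch: space-time mapping of the n x n matrix multiplication
# nested loop, assuming scheduling vector s = (1, 1, 1) and projection
# along u = (1, 1, 1). Function names are illustrative, not from the text.

def space_time_map(n):
    """Map each iteration (i, j, k) of C[i][j] += A[i][k] * B[k][j]
    to a (time step, PE coordinate) pair."""
    placement = {}
    for i in range(n):
        for j in range(n):
            for k in range(n):
                t = i + j + k        # time step: dot product s . (i, j, k)
                pe = (i - k, j - k)  # PE coordinate: projection along u
                # Validity check: no PE executes two iterations at one step.
                assert (t, pe) not in placement
                placement[(t, pe)] = (i, j, k)
    return placement

def simulate(a, b):
    """Execute the iterations in scheduled order; computes C = A * B."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for (t, pe), (i, j, k) in sorted(space_time_map(n).items()):
        c[i][j] += a[i][k] * b[k][j]
    return c
```

Running `simulate` on small matrices reproduces the ordinary product, which confirms that the mapping only reorders the iterations; the injectivity assertion inside the loop is the key correctness condition of any valid space-time mapping.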