A three-dimensional approach to parallel matrix multiplication
IBM Journal of Research and Development
IEEE Transactions on Software Engineering - Special issue on architecture-independent languages and software tools parallel processing
Parallel Computation: MM +/- X
Informatics - 10 Years Back. 10 Years Ahead.
Disk Resident Arrays: An Array-Oriented I/O Library for Out-Of-Core Computations
FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
A New Parallel Matrix Multiplication Algorithm on Distributed-Memory Concurrent Computers
HPC-ASIA '97 Proceedings of the High-Performance Computing on the Information Superhighway, HPC-Asia '97
Programming for parallelism and locality with hierarchically tiled arrays
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Performance without pain = productivity: data layout and collective communication in UPC
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
Computationally efficient parallel matrix-matrix multiplication on the torus
ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
Dimension reduction and visualization of large high-dimensional data via interpolation
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
The general matrix multiply-add operation on 2D torus
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
The Journal of Supercomputing
Hi-index | 0.00 |
In this paper, we give a straight forward, highly efficient, scalable implementation of common matrix multiplication operations. The algorithms are much simpler than previously published methods, yield better performance, and require less work space. MPI implementations are given, as are performance results on the Intel Paragon system.