In this paper, we give what we believe to be the first high-performance parallel implementation of Strassen's algorithm for matrix multiplication. We show how, under restricted conditions, this algorithm can be implemented as a plug-compatible replacement for standard parallel matrix multiplication algorithms. Results obtained on a large Intel Paragon system show a 10-20% reduction in execution time compared to what we believe to be the fastest standard parallel matrix multiplication implementation available at this time.
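For readers unfamiliar with the algorithm, the following is a minimal serial sketch of one level of Strassen's recursion. It is an illustration only and not the parallel implementation reported in the paper: the function names (`mat_mul`, `strassen_one_level`, and the helpers) are hypothetical, square matrices of even dimension are assumed, and the seven block products are delegated to an ordinary triple-loop multiply.

```python
def mat_add(X, Y):
    # Element-wise sum of two equally sized matrices (lists of rows).
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    # Element-wise difference of two equally sized matrices.
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_mul(X, Y):
    # Conventional O(n^3) product; stands in for the "standard" multiply.
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def split(M):
    # Partition an even-dimension square matrix into four quadrants.
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen_one_level(A, B):
    # One level of Strassen: 7 block products instead of the usual 8.
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    M1 = mat_mul(mat_add(A11, A22), mat_add(B11, B22))
    M2 = mat_mul(mat_add(A21, A22), B11)
    M3 = mat_mul(A11, mat_sub(B12, B22))
    M4 = mat_mul(A22, mat_sub(B21, B11))
    M5 = mat_mul(mat_add(A11, A12), B22)
    M6 = mat_mul(mat_sub(A21, A11), mat_add(B11, B12))
    M7 = mat_mul(mat_sub(A12, A22), mat_add(B21, B22))
    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)
    # Reassemble the four quadrants into one matrix.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

Because `strassen_one_level` takes and returns the same arguments as `mat_mul`, it can be swapped in for the conventional routine without changing the caller. This is a loose serial analogue of plug compatibility, under the restriction here that the dimension is even.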