The main purpose of this paper is to present a fast matrix multiplication algorithm taken from the paper of Laderman et al. (Linear Algebra Appl. 162-164 (1992) 557) in a refined, compact "analytical" form, and to demonstrate that it can be implemented as quite efficient computer code. Our improved presentation enables us to substantially simplify the analysis of the computational complexity and numerical stability of the algorithm, as well as its computer implementation. The algorithm multiplies two N × N matrices using O(N^2.7760) arithmetic operations. In the case where N = 18 · 48^k, for a positive integer k, the total number of flops required by the algorithm is 4.894 N^2.7760 - 16.165 N^2, which may be compared with a similar estimate for the Winograd algorithm, 3.732 N^2.8074 - 5 N^2 flops at N = 8 · 2^k, the latter being the current record bound among all known practical algorithms. Moreover, we present pseudo-code of the algorithm which demonstrates its very moderate working-memory requirements, much smaller than those of the best available implementations of the Strassen and Winograd algorithms. For matrices of medium-large size (say, 2000 ≤ N
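The two flop-count estimates above can be compared numerically. The following is a minimal sketch: the formulas (4.894 N^2.7760 - 16.165 N^2 at N = 18 · 48^k, and 3.732 N^2.8074 - 5 N^2 at N = 8 · 2^k) are taken directly from the text, while the function names and the printed table are illustrative assumptions, not part of either algorithm's implementation.

```python
# Hedged sketch: evaluate the flop-count estimates quoted in the abstract
# at their respective admissible matrix sizes. Only the two formulas come
# from the paper; everything else (names, loop bounds, output format) is
# illustrative.

def laderman_flops(k: int) -> tuple[int, float]:
    """Flop estimate for the refined Laderman et al. algorithm, N = 18 * 48**k."""
    n = 18 * 48**k
    return n, 4.894 * n**2.7760 - 16.165 * n**2

def winograd_flops(k: int) -> tuple[int, float]:
    """Flop estimate for the Winograd algorithm, N = 8 * 2**k."""
    n = 8 * 2**k
    return n, 3.732 * n**2.8074 - 5.0 * n**2

if __name__ == "__main__":
    # Compare a few admissible sizes for each algorithm.
    for k in range(3):
        n, f = laderman_flops(k)
        print(f"Laderman et al.: N = {n:8d}, ~{f:.3e} flops")
    for k in range(9):
        n, f = winograd_flops(k)
        print(f"Winograd:        N = {n:8d}, ~{f:.3e} flops")
```

Note that the admissible sizes differ (powers of 48 times 18 versus powers of 2 times 8), so the two estimates are directly comparable only near sizes where the sequences roughly coincide.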