Matrix multiplication via arithmetic progressions
STOC '87 Proceedings of the nineteenth annual ACM symposium on Theory of computing
Extra high speed matrix multiplication on the Cray-2
SIAM Journal on Scientific and Statistical Computing
The input/output complexity of sorting and related problems
Communications of the ACM
Matrix multiplication via arithmetic progressions
Journal of Symbolic Computation - Special issue on computational algebraic complexity
GEMMW: a portable level 3 BLAS Winograd variant of Strassen's matrix-matrix multiply algorithm
Journal of Computational Physics
Rectangular matrix multiplication revisited
Journal of Complexity
Locality of Reference in LU Decomposition with Partial Pivoting
SIAM Journal on Matrix Analysis and Applications
Implementation of Strassen's algorithm for matrix multiplication
Supercomputing '96 Proceedings of the 1996 ACM/IEEE conference on Supercomputing
Automatic Generation of Block-Recursive Codes
Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
On the Space and Access Complexity of Computation DAGs
WG '00 Proceedings of the 26th International Workshop on Graph-Theoretic Concepts in Computer Science
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
I/O complexity: The red-blue pebble game
STOC '81 Proceedings of the thirteenth annual ACM symposium on Theory of computing
Space-Time Tradeoffs in Memory Hierarchies
Space-Time Tradeoffs in Memory Hierarchies
Optimizing Graph Algorithms for Improved Cache Performance
IPDPS '02 Proceedings of the 16th International Symposium on Parallel and Distributed Processing
A cellular computer to implement the kalman filter algorithm
A cellular computer to implement the kalman filter algorithm
On the Complexity of Matrix Product
SIAM Journal on Computing
Communication lower bounds for distributed-memory matrix multiplication
Journal of Parallel and Distributed Computing
Concurrency and Computation: Practice & Experience
Group-theoretic Algorithms for Matrix Multiplication
FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science
Cache-oblivious dynamic programming
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Optimal sparse matrix dense vector multiplication in the I/O-model
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
Provably good multicore cache performance for divide-and-conquer algorithms
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
An elementary construction of constant-degree expanders
Combinatorics, Probability and Computing
Conductance and convergence of Markov chains-a combinatorial treatment of expanders
SFCS '89 Proceedings of the 30th Annual Symposium on Foundations of Computer Science
Benchmarking GPUs to tune dense linear algebra
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Algebraic Complexity Theory
Brief announcement: communication bounds for heterogeneous architectures
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Communication-optimal Parallel and Sequential Cholesky Decomposition
SIAM Journal on Scientific Computing
Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
Communication-optimal parallel algorithm for strassen's matrix multiplication
Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
A framework for low-communication 1-D FFT
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Communication-avoiding parallel strassen: implementation and performance
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Graph expansion and communication costs of fast matrix multiplication
Journal of the ACM (JACM)
A lower bound technique for communication on BSP with application to the FFT
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
Beyond reuse distance analysis: Dynamic analysis for characterization of data locality potential
ACM Transactions on Architecture and Code Optimization (TACO)
Communication costs of Strassen's matrix multiplication
Communications of the ACM
A framework for low-communication 1-D FFT
Scientific Programming - Selected Papers from Super Computing 2012
Hi-index | 0.02 |
The communication cost of algorithms (also known as I/O-complexity) is shown to be closely related to the expansion properties of the corresponding computation graphs. We demonstrate this on Strassen's and other fast matrix multiplication algorithms, and obtain the first lower bounds on their communication costs. For sequential algorithms these bounds are attainable and so optimal.