We study the problem of sparse-matrix dense-vector multiplication (SpMV) in external memory. The task of SpMV is to compute $y := Ax$, where $A$ is a sparse $N\times N$ matrix and $x$ is a vector. We express sparsity by a parameter $k$, and for each choice of $k$ consider the class of matrices with $kN$ nonzero entries, i.e., with an average of $k$ nonzero entries per column. We investigate the worst-case I/O complexity, i.e., the best possible upper bound on the number of I/Os, as a function of $k$, $N$ and the parameters $M$ (memory size) and $B$ (track size) of the I/O-model. We determine this complexity up to a constant factor for all meaningful choices of these parameters, as long as $k \le N^{1-\varepsilon}$, where $\varepsilon$ depends on the problem variant. Our model of computation for the lower bound is a combination of the I/O-models of Aggarwal and Vitter, and of Hong and Kung. We study variants of the problem, differing in the memory layout of $A$. If $A$ is stored in column major layout, we prove that SpMV has I/O complexity $\Theta(\min\{\frac{kN}{B}\max\{1,\log_{M/B}\frac{N}{\max\{k,M\}}\},\,kN\})$ for $k \le N^{1-\varepsilon}$ with any constant $0 < \varepsilon < 1$ and $k \le N/2$. In the cache-oblivious setting, under the tall-cache assumption $M \ge B^{1+\varepsilon}$, the I/O complexity is $\mathcal{O}(\frac{kN}{B}\max\{1,\log_{M/B}\frac{N}{\max\{k,M\}}\})$ for $A$ in column major layout.
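As a point of reference, SpMV over a column major (compressed-sparse-column style) representation of $A$ can be sketched as follows; the array names and layout below are illustrative assumptions, not taken from the paper.

```python
# Sketch: SpMV y := A x with A stored column by column (CSC-style arrays).
# col_ptr[j]..col_ptr[j+1] delimits column j's entries in row_idx/values.
def spmv_column_major(col_ptr, row_idx, values, x):
    """Multiply a sparse N x N matrix (CSC arrays) by a dense vector x."""
    n = len(x)
    y = [0.0] * n
    for j in range(n):  # scan A column by column, as the layout dictates
        for p in range(col_ptr[j], col_ptr[j + 1]):
            y[row_idx[p]] += values[p] * x[j]  # scatter into y
    return y

# 3x3 example with k = 1 nonzero entry per column:
# A = [[2, 0, 0],
#      [0, 0, 1],
#      [0, 3, 0]]
col_ptr = [0, 1, 2, 3]
row_idx = [0, 2, 1]
values = [2.0, 3.0, 1.0]
print(spmv_column_major(col_ptr, row_idx, values, [1.0, 1.0, 1.0]))  # [2.0, 1.0, 3.0]
```

Note that the column scan reads $x$ sequentially but scatters updates into arbitrary rows of $y$; once $N$ exceeds the memory size $M$, these scattered accesses are what make the I/O cost nontrivial, which is the regime the bounds above address.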