Cache-sensitive MapReduce DGEMM algorithms for shared memory architectures
Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference
Hi-index | 0.00 |
Cache-oblivious algorithms for matrix multiplication are confirmed as an effective way of exploiting Intel architecture shared-memory multiprocessors. The performance also remains consistent across a wide range of matrix size. The Cilk programming environment remains an effective way of implementing this type of algorithm, but the need for portability and a compiler upgrade route mean that a portability library is a competitive alternative. The paper considers the interaction of matrix multiplication algorithms with the memory hierarchy, as well as multithreading across differing operating system variants and compilers.