A cache oblivious algorithm for matrix multiplication based on Peano's space filling curve

Authors:
Michael Bader; Christoph Zenger
Affiliations:
Dept. of Informatics, TU München, München, Germany; Dept. of Informatics, TU München, München, Germany
Venue:
PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
Year:
2005

Citing 5

Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming

Recursion leads to automatic variable blocking for dense linear-algebra algorithms
IBM Journal of Research and Development

Nonlinear array layouts for hierarchical memory systems
ICS '99 Proceedings of the 13th international conference on Supercomputing

Cache-Oblivious Algorithms
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science

I/O complexity: The red-blue pebble game
STOC '81 Proceedings of the thirteenth annual ACM symposium on Theory of computing

Cited 4

Analyzing block locality in Morton-order and Morton-hybrid matrices
ACM SIGARCH Computer Architecture News

Cache oblivious matrix operations using Peano curves
PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing

Hardware-oriented implementation of cache oblivious matrix operations based on space-filling curves
PPAM'07 Proceedings of the 7th international conference on Parallel processing and applied mathematics

Cache-oblivious polygon indecomposability testing
Proceedings of the 4th International Workshop on Parallel and Symbolic Computation

Abstract

Cache oblivious algorithms are designed to exploit any kind of cache memory inherently, regardless of its size or architecture. In this article, we discuss a cache oblivious algorithm for matrix multiplication. The elements of the matrices are stored in the order given by a Peano space filling curve. A block recursive approach then leads to an algorithm in which memory access to matrix elements is strictly local. Consequently, the algorithm shows several interesting properties concerning cache performance, prefetching strategies, and even parallelization.
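
To make the storage scheme more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes square matrices of size 3^k x 3^k, maps a Peano index to a grid position with the classical base-3 digit construction of the Peano curve, and multiplies two Peano-ordered matrices by recursing over 3 x 3 sub-blocks. The function names (`peano_coords`, `to_peano`, `from_peano`, `peano_matmul`) are hypothetical, and the index lookup table is a simplification: the sketch does not attempt to reproduce the ordering of block operations that gives the strictly local access described in the abstract.

```python
def peano_coords(t, k):
    """Map Peano index t to (row, col) on a 3**k x 3**k grid (assumed layout)."""
    digits = []
    for _ in range(2 * k):          # base-3 digits of t, least significant first
        digits.append(t % 3)
        t //= 3
    digits.reverse()                # most significant digit first
    row = col = 0
    col_sum = row_sum = 0           # running sums of the raw digits seen so far
    for m in range(k):
        dc, dr = digits[2 * m], digits[2 * m + 1]
        # A digit is mirrored (d -> 2 - d) when the sum of the preceding
        # digits belonging to the other coordinate is odd.
        c = 2 - dc if row_sum % 2 else dc
        col_sum += dc
        r = 2 - dr if col_sum % 2 else dr
        row_sum += dr
        row, col = 3 * row + r, 3 * col + c
    return row, col


def to_peano(matrix, k):
    """Flatten a 3**k x 3**k list-of-lists into Peano order."""
    flat = []
    for t in range(9 ** k):
        r, c = peano_coords(t, k)
        flat.append(matrix[r][c])
    return flat


def from_peano(flat, k):
    """Inverse of to_peano: rebuild the row-major list-of-lists."""
    n = 3 ** k
    matrix = [[0] * n for _ in range(n)]
    for t, value in enumerate(flat):
        r, c = peano_coords(t, k)
        matrix[r][c] = value
    return matrix


def peano_matmul(a, b, k):
    """Return A*B, with all three matrices stored in Peano order."""
    n = 3 ** k
    index = [[0] * n for _ in range(n)]   # (row, col) -> Peano index
    for t in range(n * n):
        r, c = peano_coords(t, k)
        index[r][c] = t
    out = [0] * (n * n)

    def rec(i0, j0, k0, size):
        # Multiply the size x size block of A at (i0, k0) by the block of B
        # at (k0, j0) and accumulate into the block of C at (i0, j0).
        if size == 1:
            out[index[i0][j0]] += a[index[i0][k0]] * b[index[k0][j0]]
            return
        s = size // 3
        for bi in range(3):
            for bj in range(3):
                for bk in range(3):
                    rec(i0 + bi * s, j0 + bj * s, k0 + bk * s, s)

    rec(0, 0, 0, n)
    return out
```

For a 9 x 9 matrix `A` given as a list of rows, `to_peano(A, 2)` yields the flat Peano-ordered array, and `from_peano(peano_matmul(to_peano(A, 2), to_peano(B, 2), 2), 2)` returns the product in the usual row-major form. With this construction, consecutive Peano indices always refer to neighbouring matrix cells, which is the locality property the layout is chosen for.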