Matrix Multiplication Performance on Commodity Shared-Memory Multiprocessors

  • Authors: G. Tsilikas; M. Fleury
  • Affiliations: University of Essex, UK (both authors)
  • Venue: PARELEC '04, Proceedings of the International Conference on Parallel Computing in Electrical Engineering
  • Year: 2004

Abstract

Cache-oblivious algorithms for matrix multiplication are confirmed as an effective way of exploiting Intel-architecture shared-memory multiprocessors. Their performance also remains consistent across a wide range of matrix sizes. The Cilk programming environment remains an effective way of implementing this type of algorithm, but the need for portability and for a compiler upgrade route means that a portability library is a competitive alternative. The paper considers the interaction of matrix multiplication algorithms with the memory hierarchy, as well as multithreading across differing operating system variants and compilers.
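
The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea behind cache-oblivious matrix multiplication, not the authors' implementation. It recursively halves the largest of the three dimensions until the sub-problem is small, so that blocks eventually fit in each cache level without the code knowing any cache size. The names `matmul`, `matmul_rec`, and the cutoff `BASE` are hypothetical and chosen for illustration; Python is used for readability, whereas the paper works with Cilk and a portability library.

```python
# Illustrative sketch only; names and the cutoff value are assumptions.
BASE = 16  # assumed cutoff: at or below this size, use a plain triple loop


def matmul_rec(A, B, C, i0, i1, j0, j1, k0, k1):
    """Accumulate A[i0:i1, k0:k1] * B[k0:k1, j0:j1] into C[i0:i1, j0:j1]."""
    di, dj, dk = i1 - i0, j1 - j0, k1 - k0
    if max(di, dj, dk) <= BASE:
        # Base case: i-k-j loop order walks rows of B and C contiguously.
        for i in range(i0, i1):
            for k in range(k0, k1):
                a = A[i][k]
                for j in range(j0, j1):
                    C[i][j] += a * B[k][j]
        return
    if di >= dj and di >= dk:
        # Split rows of A and C. The two halves write disjoint parts of C,
        # so a multithreaded version could run them in parallel.
        mid = i0 + di // 2
        matmul_rec(A, B, C, i0, mid, j0, j1, k0, k1)
        matmul_rec(A, B, C, mid, i1, j0, j1, k0, k1)
    elif dj >= dk:
        # Split columns of B and C. Again the halves are independent.
        mid = j0 + dj // 2
        matmul_rec(A, B, C, i0, i1, j0, mid, k0, k1)
        matmul_rec(A, B, C, i0, i1, mid, j1, k0, k1)
    else:
        # Split the inner dimension. Both halves update the same block of C,
        # so they are run one after the other here.
        mid = k0 + dk // 2
        matmul_rec(A, B, C, i0, i1, j0, j1, k0, mid)
        matmul_rec(A, B, C, i0, i1, j0, j1, mid, k1)


def matmul(A, B):
    """Return the product of A (m x k) and B (k x n) as a list of lists."""
    m, k, n = len(A), len(B), len(B[0])
    C = [[0] * n for _ in range(m)]
    matmul_rec(A, B, C, 0, m, 0, n, 0, k)
    return C
```

For example, `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`. Because the recursion never refers to a cache size, the same code adapts to every level of the memory hierarchy, which is one plausible reason for performance staying consistent across matrix sizes.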