SuperMatrix out-of-order scheduling of matrix operations for SMP and multi-core architectures

  • Authors:
  • Ernie Chan; Enrique S. Quintana-Orti; Gregorio Quintana-Orti; Robert van de Geijn

  • Affiliations:
  • The University of Texas at Austin, Austin, TX; Universidad Jaume I, Castellon, Spain; Universidad Jaume I, Castellon, Spain; The University of Texas at Austin, Austin, TX

  • Venue:
  • Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
  • Year:
  • 2007

Abstract

We discuss the high-performance parallel implementation and execution of dense linear algebra matrix operations on SMP architectures, with an eye towards multi-core processors with many cores. We argue that traditional implementations, such as those incorporated in LAPACK, cannot be easily modified to deliver both high performance and scalability on these architectures. The solution we propose is to arrange the data structures and algorithms so that matrix blocks become the fundamental units of data, and operations on these blocks become the fundamental units of computation, resulting in algorithms-by-blocks as opposed to the more traditional blocked algorithms. We show that this facilitates the adoption of techniques akin to the dynamic scheduling and out-of-order execution commonly used in superscalar processors, an approach we name SuperMatrix Out-of-Order scheduling. Performance results on a 16-CPU Itanium2-based server are used to highlight opportunities and issues related to this new approach.
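
The idea of treating blocks as units of data and block operations as units of computation can be illustrated with a small symbolic sketch. It is not the authors' runtime or its API. It assumes a Cholesky factorization as the example operation (the abstract does not name one), borrows the BLAS/LAPACK-style kernel names CHOL, TRSM, SYRK and GEMM purely as task labels, and uses hypothetical helpers (`cholesky_tasks`, `dependencies`, `schedule`). No arithmetic is performed: the code only records which blocks each task reads and writes, derives the dependencies between tasks, and lets a fixed number of workers pick up whichever tasks are ready.

```python
from collections import namedtuple

# One task = one operation on whole blocks: a label, the blocks it reads,
# and the single block it overwrites. Blocks are (row, col) index pairs.
Task = namedtuple("Task", "name reads writes")


def cholesky_tasks(n):
    """List, in program order, the block operations of a right-looking
    Cholesky factorization on an n x n grid of blocks (lower triangle)."""
    tasks = []
    for k in range(n):
        # Factor the diagonal block.
        tasks.append(Task(f"CHOL({k},{k})", {(k, k)}, (k, k)))
        # Triangular solves for the blocks below it.
        for i in range(k + 1, n):
            tasks.append(Task(f"TRSM({i},{k})", {(k, k), (i, k)}, (i, k)))
        # Update the trailing blocks.
        for i in range(k + 1, n):
            for j in range(k + 1, i + 1):
                name = "SYRK" if i == j else "GEMM"
                tasks.append(Task(f"{name}({i},{j};{k})",
                                  {(i, k), (j, k), (i, j)}, (i, j)))
    return tasks


def dependencies(tasks):
    """For each task, collect the earlier tasks it must wait for."""
    deps = [set() for _ in tasks]
    for t, task in enumerate(tasks):
        for s in range(t):
            earlier = tasks[s]
            if (earlier.writes in task.reads            # read after write
                    or task.writes in earlier.reads     # write after read
                    or task.writes == earlier.writes):  # write after write
                deps[t].add(s)
    return deps


def schedule(tasks, deps, workers):
    """Greedy scheduling (workers >= 1): at every step, up to `workers`
    ready tasks run, whatever their position in program order."""
    done, steps = set(), []
    while len(done) < len(tasks):
        ready = [t for t in range(len(tasks))
                 if t not in done and deps[t] <= done]
        batch = ready[:workers]
        steps.append([tasks[t].name for t in batch])
        done.update(batch)
    return steps


# Usage: a 3 x 3 grid of blocks scheduled on two workers.
tasks = cholesky_tasks(3)
for step, names in enumerate(schedule(tasks, dependencies(tasks), 2), 1):
    print(step, names)
```

In this toy model the ten tasks of the 3 x 3 case finish in seven steps on two workers, and the fourth step runs `SYRK(2,2;0)` next to `CHOL(1,1)`: a task from iteration 1 starts before all updates from iteration 0 have completed, which is the out-of-order behaviour the abstract alludes to. A real runtime would execute numerical kernels on the blocks and dispatch tasks to threads as they become ready rather than in lock-step rounds.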