Executing irregular scientific applications on stream architectures

  • Authors:
  • Mattan Erez;Jung Ho Ahn;Jayanth Gummaraju;Mendel Rosenblum;William J. Dally

  • Affiliations:
  • The University of Texas at Austin;Hewlett-Packard Laboratories;Stanford University;Stanford University;Stanford University

  • Venue:
  • Proceedings of the 21st annual international conference on Supercomputing
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

The recent emergence of compute-intensive stream processors such as the Cell Broadband Engine, Stanford's Merrimac, and Clear-Speed's CSX600 has made them attractive platforms for scientific high-performance computing. Unstructured mesh and graph applications are an important class of numerical algorithms used in the scientific computing domain, which are particularly challenging for stream architectures. These codes have irregular structures where nodes have a variable number of neighbors, resulting in irregular memory access patterns and irregular control. We study four representative sub-classes of irregular algorithms, including finite-element and finite-volume methods for modeling physical systems, direct methods for n-body problems, and computations involving sparse algebra. We propose a framework for representing the diverse characteristics of these algorithms in the context of the unique properties of stream architectures, and demonstrate it using one representative application from each sub-class. We then develop techniques for mapping the applications onto a stream processor, placing emphasis on data-localization and parallelizations. Our simulations show that efficient stream hardware with restricted control abilities can effectively run challenging irregular applications with, for example, a finite element method and a molecular dynamic code sustaining 69GFLOP/s and 46GFLOP/s (64-bit) respectively using a single chip that measures 12mm on a side and consumes less than 70W in 90nm technology.