Scatter-Add in Data Parallel Architectures

  • Authors:
  • Jung Ho Ahn;Mattan Erez;William J. Dally

  • Affiliations:
  • Stanford University, CA;Stanford University, CA;Stanford University, CA

  • Venue:
  • HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Many important applications exhibit large amounts of data parallelism, and modern computer systems are designed to take advantage of it. While much of the computation in the multimedia and scientific application domains is data parallel, certain operations require costly serialization that increase the run time. Examples include superposition type updates in scientific computing and histogram computations in media processing. We introduce scatter-add, which is the data-parallel form of the well-known scalar fetch-and-op, specifically tuned for SIMD/vector/stream style memory systems. The scatter-add mechanism scatters a set of data values to a set of memory addresses and adds each data value to each referenced memory location instead of overwriting it. This novel architecture extension allows us to efficiently support data-parallel atomic update computations found in parallel programming languages such as HPF, and applies both to single-processor and multi-processor SIMD data-parallel systems. We detail the micro-architecture of a scatter-add implementation on a stream architecture, which requires less than 2% increase in die area yet shows performance speedups ranging from 1.45 to over 11 on a set of applications that require a scatter-add computation.