Atomic-free irregular computations on GPUs

  • Authors:
  • Rupesh Nasre; Martin Burtscher; Keshav Pingali

  • Affiliations:
  • The University of Texas at Austin; Texas State University, San Marcos; The University of Texas at Austin

  • Venue:
  • Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units
  • Year:
  • 2013

Abstract

Atomic instructions are a key ingredient of codes that operate on irregular data structures like trees and graphs. It is well known that atomics can be expensive, especially on massively parallel GPUs, and are often on the critical path of a program. In this paper, we present two high-level methods to eliminate atomics in irregular programs. The first method advocates synchronous processing using barriers. We illustrate how to exploit synchronous processing to avoid atomics even when the threads' memory accesses conflict with each other. The second method is based on exploiting algebraic properties of algorithms to elide atomics. Specifically, we focus on three key properties: monotonicity, idempotency and associativity, and show how each of them enables an atomic-free implementation. We illustrate the generality of the two methods by applying them to five irregular graph applications: breadth-first search, single-source shortest paths computation, Delaunay mesh refinement, pointer analysis and survey propagation, and show that both methods provide substantial speedup in each case on different GPUs.
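
To make the first method and the idempotency property concrete, here is a minimal CPU-side sketch of level-synchronous breadth-first search in Python. It is not the authors' GPU implementation: the function name `bfs_levels`, the adjacency-list input, the `in_next` flag array, and the worker count are all assumptions made for illustration. Workers share the `dist` array without locks. Two workers may discover the same vertex in the same level, but both store the same value (`level + 1`), so the duplicate write is harmless. A barrier separates the levels, and its action callback builds the next frontier in a single thread.

```python
import threading


def bfs_levels(adj, source, num_workers=4):
    """Level-synchronous BFS; returns hop distances (-1 if unreachable)."""
    n = len(adj)
    dist = [-1] * n
    dist[source] = 0
    in_next = [False] * n  # flags for vertices discovered in this level
    state = {"frontier": [source], "level": 0}

    def advance():
        # Runs in exactly one thread, after every worker has reached the
        # barrier and before any of them is released.
        state["frontier"] = [v for v in range(n) if in_next[v]]
        for v in state["frontier"]:
            in_next[v] = False
        state["level"] += 1

    barrier = threading.Barrier(num_workers, action=advance)

    def worker(tid):
        while state["frontier"]:
            frontier, level = state["frontier"], state["level"]
            # Each worker takes a strided slice of the current frontier.
            for u in frontier[tid::num_workers]:
                for v in adj[u]:
                    if dist[v] == -1:
                        # Idempotent writes: every worker that reaches v in
                        # this level stores the same values, so no lock or
                        # atomic update is used here.
                        dist[v] = level + 1
                        in_next[v] = True
            barrier.wait()  # no worker starts level k+1 before k is done

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dist
```

In this sketch the barrier plays the role that a global synchronization between kernel launches might play on a GPU. Vertex 3 in a graph such as `[[1, 2], [3], [3], [4], [], [0]]` can be reached from both vertex 1 and vertex 2 in the same level, which exercises the duplicate-write case. The sketch leans on Python executing each list store indivisibly; a GPU version would have to rely on the hardware's guarantees for word-sized stores instead.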