Speculative parallelization of partial reduction variables

  • Authors:
  • Liang Han;Wei Liu;James M. Tuck

  • Affiliations:
  • North Carolina State University, Raleigh, NC, USA;Intel Corp., Santa Clara, CA, USA;North Carolina State University, Raleigh, NC, USA

  • Venue:
  • Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Reduction variables are an important class of cross-thread dependence that can be parallelized by exploiting the associativity and commutativity of their operation. In this paper, we define a class of shared variables called partial reduction variables (PRV). These variables either cannot be proven to be reductions or they violate the requirements of a reduction variable in some way. We describe an algorithm that allows the compiler to detect PRVs, and we also discuss the necessary requirements to parallelize detected PRVs. Based on these requirements, we propose an implementation in a TLS system to parallelize PRVs that works by a combination of techniques at compile time and in the hardware. The compiler transforms the variable under the assumption that the reduction-like behavior proven statically will hold true at runtime. However, if a thread reads or updates the shared variable as a result of an alias or unlikely control path, a lightweight hardware mechanism will detect the access and synchronize it to ensure correct execution. We implement our compiler analysis and transformation in GCC, and analyze its potential on the SPEC CPU 2000 benchmarks.We find that supporting PRVs provides up to 46% performance gain over a highly optimized TLS system and on average 10.7% performance improvement.