Iteration Disambiguation for Parallelism Identification in Time-Sliced Applications

  • Authors:
  • Shane Ryoo;Christopher I. Rodrigues;Wen-Mei W. Hwu

  • Affiliations:
  • Center for Reliable and High-Performance Computing and Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign,;Center for Reliable and High-Performance Computing and Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign,;Center for Reliable and High-Performance Computing and Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign,

  • Venue:
  • Languages and Compilers for Parallel Computing
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Media and scientific simulation applications have a large amount of parallelism that can be exploited in contemporary multi-core microprocessors. However, traditional pointer and array analysis techniques often fall short in automatically identifying this parallelism. This is due to the allocation and referencing patterns of time-slicing algorithms, where information flows from one time slice to the next. In these, an object is allocated within a loop and written to, with source data obtained from objects created in previous iterations of the loop. The objects are typically allocated at the same static call site through the same call chain in the call graph, making them indistinguishable by traditional heap-sensitive analysis techniques that use call chains to distinguish heap objects. As a result, the compiler cannot separate the source and destination objects within each time slice of the algorithm. In this work we discuss an analysis that quickly identifies these objects through a partially flow-sensitive technique called iteration disambiguation. This is done through a relatively simple aging mechanism. We show that this analysis can distinguish objects allocated in different time slices across a wide range of benchmark applications within tens of seconds even for complete media applications. We will also discuss the obstacles to automatically identifying the remaining parallelism in studied applications and propose methods to address them.