Dependence-based code generation for a CELL processor

  • Authors:
  • Yuan Zhao; Ken Kennedy

  • Affiliations:
  • Computer Science Department, Rice University, Houston, TX; Computer Science Department, Rice University, Houston, TX

  • Venue:
  • LCPC'06: Proceedings of the 19th International Conference on Languages and Compilers for Parallel Computing
  • Year:
  • 2006

Abstract

Obtaining high performance on the STI CELL processor requires substantial programming effort because its architectural features must be explicitly managed, with separate codes required for two different types of cores (PPE and SPE). Research at IBM has developed a single source-image compiler for CELL that performs vectorization but uses OpenMP to specify cross-core parallelism. In this paper, we present and evaluate an alternative dependence-based compiler approach that automatically generates parallel and vector code for CELL from a single source program with no parallelism directives. In contrast to OpenMP, our approach can also handle loop nests that carry dependences. To preserve correct program semantics, we employ on-chip communication mechanisms to implement barrier and unidirectional synchronization primitives. We also implement strategies to boost performance by managing DMA data movement, improving data alignment, and exploiting memory reuse in the innermost loop.
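To illustrate what a loop nest that carries dependences combined with unidirectional synchronization can look like, here is a minimal sketch. It is not the authors' compiler output: Python threads stand in for SPEs, semaphores stand in for the on-chip communication mechanisms, and the recurrence and the names `sequential`, `pipelined`, and `ready` are hypothetical choices made for this example. Each worker owns a block of columns and, for every row, waits for a one-way signal from its left neighbor before computing.

```python
import threading


def sequential(n, m):
    """Reference loop nest: both the i and j loops carry a dependence."""
    a = [[1] * m for _ in range(n)]
    for i in range(1, n):
        for j in range(1, m):
            a[i][j] = a[i - 1][j] + a[i][j - 1]
    return a


def pipelined(n, m, workers):
    """Same loop nest, with columns split into blocks across workers.

    Assumption: threads model the parallel cores; a semaphore models a
    unidirectional "row finished" signal from worker w-1 to worker w.
    """
    a = [[1] * m for _ in range(n)]
    ready = [threading.Semaphore(0) for _ in range(workers)]
    chunk = (m - 1 + workers - 1) // workers  # columns 1..m-1 per worker

    def work(w):
        lo = 1 + w * chunk
        hi = min(m, lo + chunk)
        for i in range(1, n):
            if w > 0:
                # Wait until the left neighbor has produced a[i][lo - 1].
                ready[w].acquire()
            for j in range(lo, hi):
                a[i][j] = a[i - 1][j] + a[i][j - 1]
            if w + 1 < workers:
                # One-way signal: row i of this block is complete.
                ready[w + 1].release()

    threads = [threading.Thread(target=work, args=(w,)) for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # joining every worker plays the role of a final barrier
    return a


if __name__ == "__main__":
    print(pipelined(6, 9, 3) == sequential(6, 9))
```

The signals flow in one direction only (left to right), so a worker never waits on a neighbor that is waiting on it, and the pipelined result matches the sequential one.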