Tile Percolation: An OpenMP Tile Aware Parallelization Technique for the Cyclops-64 Multicore Processor

  • Authors:
  • Ge Gan;Xu Wang;Joseph Manzano;Guang R. Gao

  • Affiliations:
  • Dep. of Electrical and Computer Engineering, University of Delaware Newark, Delaware, U.S.A. 19716;Dep. of Electrical Engineering, Jilin University, Jilin, China 130000;Dep. of Electrical and Computer Engineering, University of Delaware Newark, Delaware, U.S.A. 19716;Dep. of Electrical and Computer Engineering, University of Delaware Newark, Delaware, U.S.A. 19716

  • Venue:
  • Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Programming a multicore processor is difficult. It is even more difficult if the processor has software-managed memory hierarchy, e.g. the IBM Cyclops-64 (C64). A widely accepted parallel programming solution for multicore processor is OpenMP. Currently, all OpenMP directives are only used to decompose computation code (such as loop iterations, tasks, code sections, etc.). None of them can be used to control data movement, which is crucial for the C64 performance. In this paper, we propose a technique called tile percolation . This method provides the programmer with a set of OpenMP pragma directives. The programmer can use these directives to annotate their program to specify where and how to perform data movement. The compiler will then generate the required code accordingly. Our method is a semi-automatic code generation approach intended to simplify a programmer's work. The paper provides (a) an exploration of the possibility of developing pragma directives for semi-automatic data movement code generation in OpenMP; (b) an introduction of techniques used to implement tile percolation including the programming API, the code generation in compiler, and the required runtime support routines; (c) and an evaluation of tile percolation with a set of benchmarks. Our experimental results show that tile percolation can make the OpenMP programs run on the C64 chip more efficiently.