Multi-target C++ implementation of parallel skeletons

  • Authors: Wilfried Kirschenmann; Laurent Plagne; Stephane Vialle

  • Affiliations: EDF R&D & AlGorille INRIA, Clamart, France; EDF R&D, Clamart, France; SUPELEC - IMS group & AlGorille INRIA project team, Metz Cedex, France

  • Venue: Proceedings of the 8th workshop on Parallel/High-Performance Object-Oriented Scientific Computing
  • Year: 2009

Abstract

This paper presents the design of an efficient multi-target (CPU+GPU) implementation of the Parallel_for skeleton. Emerging massively parallel architectures promise very high performance at low cost. However, these architectures change faster than ever, so optimizing codes becomes a very complex and time-consuming task. We have identified data storage as the main difference between the CPU and GPU implementations of a code. We introduce an abstract data layout in order to adapt the data storage to each target. Based on this layout, the Parallel_for skeleton makes it possible to compile and execute the same program on both CPU and GPU. Once compiled, the program runs close to the hardware limits.
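The idea of pairing an abstract data layout with a parallel-for skeleton can be illustrated with a small CPU-only sketch. This is not the authors' implementation: the names `ArrayOfStructs`, `StructOfArrays`, `Field`, `parallel_for`, and `fill_and_sum` are hypothetical, the GPU back end is omitted, and the choice of an interleaved layout for CPU and a per-component layout for GPU is an assumption based on common practice rather than something stated in the abstract. The sketch only shows how the same kernel can be compiled against two storage layouts selected by a template parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical layout policies: map (element i, component c) to a flat offset.
struct ArrayOfStructs {  // components of one element are contiguous
    static std::size_t offset(std::size_t i, std::size_t c,
                              std::size_t /*n*/, std::size_t ncomp) {
        return i * ncomp + c;
    }
};
struct StructOfArrays {  // one component of consecutive elements is contiguous
    static std::size_t offset(std::size_t i, std::size_t c,
                              std::size_t n, std::size_t /*ncomp*/) {
        return c * n + i;
    }
};

// Container whose physical storage order is chosen by the Layout policy.
template <class Layout>
class Field {
public:
    Field(std::size_t n, std::size_t ncomp)
        : n_(n), ncomp_(ncomp), data_(n * ncomp, 0.0f) {}
    float& operator()(std::size_t i, std::size_t c) {
        return data_[Layout::offset(i, c, n_, ncomp_)];
    }
    float operator()(std::size_t i, std::size_t c) const {
        return data_[Layout::offset(i, c, n_, ncomp_)];
    }
private:
    std::size_t n_, ncomp_;
    std::vector<float> data_;
};

// Minimal parallel-for skeleton: splits [0, n) into contiguous chunks,
// one per worker thread. The body must be safe to call concurrently
// for distinct indices.
template <class Body>
void parallel_for(std::size_t n, Body body, unsigned nthreads = 4) {
    if (nthreads == 0) nthreads = 1;
    const std::size_t chunk = (n + nthreads - 1) / nthreads;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t) {
        const std::size_t begin = t * chunk;
        const std::size_t end = std::min(n, begin + chunk);
        if (begin >= end) break;
        workers.emplace_back([&body, begin, end]() {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& w : workers) w.join();
}

// The same kernel source is instantiated for either layout.
template <class Layout>
float fill_and_sum(std::size_t n) {
    const std::size_t ncomp = 3;
    Field<Layout> f(n, ncomp);
    parallel_for(n, [&f, ncomp](std::size_t i) {
        for (std::size_t c = 0; c < ncomp; ++c)
            f(i, c) = static_cast<float>(i) * static_cast<float>(c + 1);
    });
    float total = 0.0f;  // sequential reduction after all workers have joined
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t c = 0; c < ncomp; ++c) total += f(i, c);
    return total;
}
```

Calling `fill_and_sum<ArrayOfStructs>(8)` and `fill_and_sum<StructOfArrays>(8)` yields the same result, since only the physical placement of the values differs. In a multi-target setting the layout policy would be selected per target at compile time, so the kernel body itself stays unchanged.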