Efficient implementation of data flow graphs on multi-gpu clusters

  • Authors:
  • Vincent Boulos;Sylvain Huet;Vincent Fristot;Luc Salvo;Dominique Houzet

  • Affiliations:
  • GIPSA-Lab, Image-Signal Department, CNRS UMR 5216, University of Grenoble, Saint Martin d'Heres, France 38402;GIPSA-Lab, Image-Signal Department, CNRS UMR 5216, University of Grenoble, Saint Martin d'Heres, France 38402;GIPSA-Lab, Image-Signal Department, CNRS UMR 5216, University of Grenoble, Saint Martin d'Heres, France 38402;SIMAP, GPM2 Group, CNRS UMR 5266, University of Grenoble, Saint Martin d'Heres, France 38402;GIPSA-Lab, Image-Signal Department, CNRS UMR 5216, University of Grenoble, Saint Martin d'Heres, France 38402

  • Venue:
  • Journal of Real-Time Image Processing
  • Year:
  • 2014

Abstract

Nowadays, it is possible to build a multi-GPU supercomputer, well suited to implementing digital signal processing algorithms, for a few thousand dollars. However, to achieve the highest performance with this kind of architecture, the programmer has to focus on inter-processor communication and task synchronization. In this paper, we propose a high-level programming model based on a data flow graph (DFG) that allows an efficient implementation of digital signal processing applications on a multi-GPU computer cluster. This DFG-based design flow abstracts the underlying architecture. We focus particularly on the efficient implementation of communications by automating computation-communication overlap, which can lead to significant speedups, as shown in the presented benchmark. The approach is validated on three experiments: a multi-host multi-GPU benchmark, a 3D granulometry application developed for materials research, and an application for computing visual saliency maps.
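
To make the idea of computation-communication overlap in a data flow graph more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: threads and bounded queues merely stand in for GPU streams and inter-host links, and all names (`actor`, `run_pipeline`, `upload`, `kernel`, `download`, `depth`) are hypothetical. Each node of a linear DFG runs in its own thread and nodes are connected by FIFOs of depth 2, so one stage can handle block i+1 while the next stage is still busy with block i.

```python
import queue
import threading

_END = object()  # sentinel token marking the end of the stream


def actor(fn, inbox, outbox):
    """Run one DFG node: apply fn to every token from inbox, push to outbox."""
    while True:
        token = inbox.get()
        if token is _END:
            outbox.put(_END)  # propagate shutdown to the next node
            return
        outbox.put(fn(token))


def run_pipeline(stages, tokens, depth=2):
    """Chain stages with bounded FIFOs and run each stage in its own thread.

    depth=2 models double buffering: a stage may produce one block ahead
    of its consumer, which is what lets transfer and compute overlap.
    """
    edges = [queue.Queue(maxsize=depth) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=actor, args=(fn, edges[i], edges[i + 1]))
        for i, fn in enumerate(stages)
    ]
    results = []

    def drain():
        # Collect tokens leaving the last edge until the sentinel arrives.
        while True:
            token = edges[-1].get()
            if token is _END:
                return
            results.append(token)

    sink = threading.Thread(target=drain)
    for t in workers + [sink]:
        t.start()
    for token in tokens:
        edges[0].put(token)  # blocks when the first FIFO is full
    edges[0].put(_END)
    for t in workers + [sink]:
        t.join()
    return results


# Hypothetical stand-ins for the three phases of processing one data block.
def upload(block):
    return list(block)  # assumption: models a host-to-device copy


def kernel(block):
    return [x * x for x in block]  # assumption: models the GPU computation


def download(block):
    return sum(block)  # assumption: models reduction and copy back to host


if __name__ == "__main__":
    print(run_pipeline([upload, kernel, download], [[1, 2], [3, 4], [5]]))
```

Run as a script, this prints `[5, 25, 25]`. Because each stage is a single thread reading from a FIFO, results come out in input order. In a real multi-GPU setting the same structure would presumably map to asynchronous copies and kernels issued on separate streams, with the FIFO depth bounding the memory used for in-flight blocks.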