Automatic Resource Scheduling with Latency Hiding for Parallel Stencil Applications on GPGPU Clusters

Authors:
Kumiko Maeda;Masana Murase;Munehiro Doi;Hideaki Komatsu;Shigeho Noda;Ryutaro Himeno
Venue:
IPDPS '12 Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium
Year:
2012

Citing 0
Cited 1

VGRIS: virtualized GPU resource isolation and scheduling in cloud gaming

Proceedings of the 22nd international symposium on High-performance parallel and distributed computing

Abstract

Overlapping computation with communication is key to accelerating stencil applications on parallel computers, especially GPU clusters. However, writing code that achieves this overlap is a time-consuming part of stencil application development. To address this problem, we developed a code generation tool that automatically produces a parallel stencil application with latency hiding from its dataflow model. With this tool, users visually construct the workflows of stencil applications in a dataflow programming model. Our dataflow compiler determines a data decomposition policy for each application and generates source code that overlaps the stencil computations with communication (MPI and PCIe). We demonstrate two overlapping models: a CPU-GPU hybrid execution model and a GPU-only model. We evaluate our scheduling performance with a CFD benchmark that computes 19-point 3D stencils, achieving 1.45 TFLOPS in single precision on a cluster with 64 Tesla C1060 GPUs.
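
The latency-hiding pattern described in the abstract can be illustrated with a minimal sketch. This is not the code the authors' tool generates: it assumes a 1D 3-point averaging stencil in place of the 19-point 3D stencil, two subdomains inside a single process, and a hypothetical `fetch_halo` function standing in for an MPI or PCIe transfer. All function names are illustrative. The point is only the ordering: start the halo transfer, update the interior cells that need no remote data, then finish the subdomain boundary cells once the halo has arrived.

```python
from concurrent.futures import ThreadPoolExecutor


def stencil_point(left, center, right):
    # 3-point average; a stand-in for the paper's 19-point 3D stencil.
    return (left + center + right) / 3.0


def reference_step(u):
    # Whole-domain update; the two end cells are kept fixed.
    out = list(u)
    for i in range(1, len(u) - 1):
        out[i] = stencil_point(u[i - 1], u[i], u[i + 1])
    return out


def fetch_halo(neighbor_value):
    # Hypothetical placeholder for an MPI or PCIe halo transfer.
    return neighbor_value


def overlapped_step(u, split):
    # Decompose the domain into two subdomains at index `split`
    # (each subdomain must hold at least two cells).
    parts = [u[:split], u[split:]]
    # (left halo, right halo) per subdomain; None means no neighbor.
    needed = [(None, parts[1][0]), (parts[0][-1], None)]
    results = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        for part, (left_src, right_src) in zip(parts, needed):
            # 1. Start the halo transfers asynchronously.
            left_f = pool.submit(fetch_halo, left_src)
            right_f = pool.submit(fetch_halo, right_src)
            # 2. Update interior cells, which need no halo data.
            out = list(part)
            for i in range(1, len(part) - 1):
                out[i] = stencil_point(part[i - 1], part[i], part[i + 1])
            # 3. Wait for the halos, then update the boundary cells.
            left, right = left_f.result(), right_f.result()
            if left is not None:
                out[0] = stencil_point(left, part[0], part[1])
            if right is not None:
                out[-1] = stencil_point(part[-2], part[-1], right)
            results.append(out)
    return results[0] + results[1]


u = [float(i * i) for i in range(10)]
assert overlapped_step(u, 4) == reference_step(u)
```

In this sketch the two subdomains are processed one after the other, so it demonstrates the interior/boundary split and the ordering of the three steps rather than any real speedup. On a cluster, each subdomain would belong to a separate process or GPU, and step 1 would be a non-blocking transfer.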