A compound OpenMP/MPI program development toolkit for hybrid CPU/GPU clusters

Authors:
Hung-Fu Li;Tyng-Yeu Liang;Jun-Yao Chiu
Affiliations:
Department of Electrical Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan, R.O.C.;Department of Electrical Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan, R.O.C.;Department of Electrical Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan, R.O.C.
Venue:
The Journal of Supercomputing
Year:
2013

Abstract

In this paper, we propose OMPICUDA, a program development toolkit for hybrid CPU/GPU clusters. With this toolkit, users can develop applications for a hybrid CPU/GPU cluster in a familiar programming model, compound OpenMP and MPI, instead of mixed CUDA and MPI or software distributed shared memory (SDSM). In addition, by means of an extended device directive, they can choose the type of resource used to execute each parallel region of a program according to the properties of that region. The toolkit also provides a set of data-partition interfaces that let users achieve load balance at the application level regardless of which type of resource executes their programs.
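
The abstract does not show the data-partition interfaces themselves, so the following is only a minimal sketch of the general idea behind application-level load balancing: loop iterations are divided among workers in proportion to their relative speeds, so that a faster resource (for example a GPU) receives a larger share than a slower one (for example a CPU core). The function name `partition_by_weight`, its parameters, and the notion of a per-worker weight are assumptions made for illustration; they are not the OMPICUDA API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, NOT part of OMPICUDA: split `total` loop iterations
 * among `nworkers` workers in proportion to `weight[i]`, an assumed measure
 * of each worker's relative speed. On return, worker i owns the half-open
 * range [offset[i], offset[i] + count[i]).
 */
void partition_by_weight(size_t total, const double *weight, int nworkers,
                         size_t *count, size_t *offset)
{
    assert(nworkers > 0);

    /* Sum the weights so each worker's share can be normalized. */
    double sum = 0.0;
    for (int i = 0; i < nworkers; ++i) {
        assert(weight[i] >= 0.0);
        sum += weight[i];
    }
    assert(sum > 0.0);

    size_t assigned = 0; /* iterations handed out so far */
    double cum = 0.0;    /* cumulative weight up to and including worker i */
    for (int i = 0; i < nworkers; ++i) {
        cum += weight[i];

        /* End boundary from the cumulative share; the last worker takes
           whatever remains, so no iteration is lost to rounding. */
        size_t end = (i == nworkers - 1)
                         ? total
                         : (size_t)((double)total * (cum / sum));

        /* Keep boundaries monotonic and in range despite rounding. */
        if (end < assigned) end = assigned;
        if (end > total) end = total;

        offset[i] = assigned;
        count[i] = end - assigned;
        assigned = end;
    }
}
```

In a compound OpenMP/MPI program, each process could pass its own entry of `count` and `offset` to a collective such as `MPI_Scatterv` and then run its parallel region over only that sub-range. How the weights are obtained (static configuration or measured throughput) is left open here.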