CPU/GPU computing for long-wave radiation physics on large GPU clusters

Authors:
Fengshun Lu;Junqiang Song;Xiaoqun Cao;Xiaoqian Zhu
Affiliations:
College of Computer, National University of Defense Technology, 410073 Changsha, Hunan, China (all authors)
Venue:
Computers & Geosciences
Year:
2012


Abstract

Geoscience simulations rely heavily on high-performance computing (HPC) systems. Many CPU/GPU heterogeneous HPC systems have been built to date, and many geoscience simulations have been run on them. Most of these simulations on GPU clusters use only the GPUs for the arithmetic operations and leave the CPUs' computational capacity unused, which underutilizes the computing resources of the HPC system as a whole. In this paper, we perform a long-wave radiation simulation that exploits the computational capacity of both the CPUs and the GPUs in the Tianhe-1A supercomputer. First, the long-wave radiation process is accelerated with a Tesla M2050 GPU and achieves a significant speedup over the baseline performance on a single Intel X5670 CPU core. Second, a workload distribution scheme based on speedup feedback is proposed and validated with various workloads. Third, a parallel programming model (MPI+OpenMP/CUDA) is presented and used to simulate the radiation physics on large GPU clusters. Finally, we address computational efficiency by exploiting the available computing resources within the Tianhe-1A supercomputer. Experimental results demonstrate that the hybrid CPU/GPU version completes in much less time than its CPU-only counterpart, and that the two versions show similar sensitivity to the temporal resolution of the radiation process.
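
The abstract does not spell out the workload distribution scheme, so the following is only a minimal sketch of how a speedup-feedback split between one GPU and several CPU cores might look. The function names, the proportional-split rule, and all numbers are assumptions made for illustration and are not taken from the paper.

```python
def split_workload(total_columns, gpu_speedup, cpu_cores):
    """Split atmospheric columns between one GPU and `cpu_cores` CPU cores.

    Assumption (not from the paper): `gpu_speedup` is the GPU throughput
    relative to ONE CPU core, so the GPU counts as `gpu_speedup` cores and
    work is divided in proportion to throughput.
    """
    if total_columns < 0 or gpu_speedup <= 0 or cpu_cores < 0:
        raise ValueError("invalid workload parameters")
    gpu_share = gpu_speedup / (gpu_speedup + cpu_cores)
    gpu_columns = round(total_columns * gpu_share)
    return gpu_columns, total_columns - gpu_columns


def update_speedup(cpu_time, cpu_columns, gpu_time, gpu_columns, cpu_cores):
    """Feedback step: re-estimate the GPU speedup from measured timings.

    Throughput is columns per second; the CPU throughput is divided by the
    number of cores to obtain a per-core rate.
    """
    gpu_rate = gpu_columns / gpu_time
    core_rate = cpu_columns / cpu_time / cpu_cores
    return gpu_rate / core_rate


# Illustrative values only: start from a guessed speedup of 15 with 5 CPU cores.
gpu_cols, cpu_cols = split_workload(1000, 15.0, 5)

# Suppose the CPU side took twice as long as the GPU side on this split;
# the measured speedup is then fed back into the next split.
measured = update_speedup(cpu_time=2.0, cpu_columns=cpu_cols,
                          gpu_time=1.0, gpu_columns=gpu_cols, cpu_cores=5)
next_gpu_cols, next_cpu_cols = split_workload(1000, measured, 5)
```

In this sketch the split shifts toward the GPU whenever the CPU side finishes later than the GPU side, so that repeated time steps tend toward both sides finishing together.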