Application of a hybrid MPI/OpenMP approach for parallel groundwater model calibration using multi-core computers

  • Authors:
  • Guoping Tang; Eduardo F. D'Azevedo; Fan Zhang; Jack C. Parker; David B. Watson; Philip M. Jardine

  • Affiliations:
  • Environmental Sciences Division, Oak Ridge National Laboratory, P.O. Box 2008, MS-6038, Oak Ridge, TN 37831-6038, USA; Computer Science and Mathematics Division, Oak Ridge National Laboratory, P.O. Box 2008, MS-6367, Oak Ridge, TN 37831-6367, USA; Institute of Tibetan Plateau Research, Chinese Academy of Sciences, P.O. Box 2871, Beijing 100085, China; Department of Civil and Environmental Engineering, University of Tennessee, Knoxville, TN 37996, USA; Environmental Sciences Division, Oak Ridge National Laboratory, P.O. Box 2008, MS-6038, Oak Ridge, TN 37831-6038, USA; Department of Biosystems Engineering and Soil Science, University of Tennessee, Knoxville, TN 37996, USA

  • Venue:
  • Computers & Geosciences
  • Year:
  • 2010

Abstract

Calibration of groundwater models involves hundreds to thousands of forward solutions, each of which may solve many transient coupled nonlinear partial differential equations, making it a computationally intensive problem. We describe a hybrid MPI/OpenMP approach that exploits two levels of parallelism in software and hardware to reduce calibration time on multi-core computers. HydroGeoChem 5.0 (HGC5) is parallelized using OpenMP for direct solutions in two applications: a reactive transport model and a field-scale coupled flow and transport model. In the reactive transport model, profiling with GPROF shows that a single parallelizable loop accounts for over 97% of the total computational time. Adding a few lines of OpenMP compiler directives to this loop yields a speedup of about 10 on a 16-core compute node. For the field-scale model, parallelizable loops are identified in the 14 of 174 HGC5 subroutines that account for 99% of the execution time. As these loops are parallelized incrementally, scalability is found to be limited by one loop for which Cray PAT detects a cache miss rate of over 90%. With this loop rewritten, a speedup similar to that of the first application is achieved. The OpenMP-parallelized code can be run efficiently as slaves under parallel PEST, on multiple workstations in a network or on multiple compute nodes of a cluster, to speed up model calibration. To run calibration on clusters as a single task, the Levenberg-Marquardt algorithm is added to HGC5, with the Jacobian calculation and lambda search parallelized using MPI. With this hybrid approach, 100-200 compute cores are used to reduce the calibration time from weeks to a few hours for these two applications. The approach is applicable to most existing groundwater model codes and to many applications.
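
The outer level of parallelism described in the abstract can be illustrated with a minimal serial sketch of Levenberg-Marquardt calibration. This is not the HGC5 implementation: the two-parameter exponential `forward_model`, the function names, the step size, and the set of trial lambda values are all assumptions chosen for illustration. The comments mark the two loops whose iterations are independent forward runs, which is the property that allows the Jacobian calculation and the lambda search to be distributed with MPI.

```python
import math


def forward_model(params, times):
    # Hypothetical stand-in for an expensive forward simulation.
    a, b = params
    return [a * math.exp(-b * t) for t in times]


def residuals(params, times, observed):
    simulated = forward_model(params, times)
    return [s - o for s, o in zip(simulated, observed)]


def sum_of_squares(values):
    return sum(v * v for v in values)


def jacobian_columns(params, times, observed, rel_step=1e-6):
    # Forward-difference Jacobian: one extra forward run per parameter.
    base = residuals(params, times, observed)
    columns = []
    for j in range(len(params)):  # independent runs: candidates for MPI ranks
        h = rel_step * max(abs(params[j]), 1.0)
        perturbed = list(params)
        perturbed[j] += h
        shifted = residuals(perturbed, times, observed)
        columns.append([(s - r) / h for s, r in zip(shifted, base)])
    return base, columns


def solve_2x2(matrix, rhs):
    # Cramer's rule; sufficient for this two-parameter toy problem.
    det = matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]
    x0 = (rhs[0] * matrix[1][1] - matrix[0][1] * rhs[1]) / det
    x1 = (matrix[0][0] * rhs[1] - rhs[0] * matrix[1][0]) / det
    return [x0, x1]


def calibrate(times, observed, start, lambdas=(1e-3, 1e-1, 1e1, 1e3), max_iter=50):
    params = list(start)
    for _ in range(max_iter):
        r, cols = jacobian_columns(params, times, observed)
        jtj = [[sum(x * y for x, y in zip(cols[i], cols[k])) for k in range(2)]
               for i in range(2)]
        jtr = [sum(x * y for x, y in zip(cols[i], r)) for i in range(2)]
        best_cost, best_params = sum_of_squares(r), params
        for lam in lambdas:  # independent trials: also candidates for MPI ranks
            damped = [[jtj[i][k] + (lam * jtj[i][i] if i == k else 0.0)
                       for k in range(2)] for i in range(2)]
            step = solve_2x2(damped, [-g for g in jtr])
            trial = [p + s for p, s in zip(params, step)]
            cost = sum_of_squares(residuals(trial, times, observed))
            if cost < best_cost:
                best_cost, best_params = cost, trial
        if best_params is params:  # no trial lambda reduced the misfit
            break
        params = best_params
    return params


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    observed = forward_model([2.0, 0.5], times)  # synthetic, noise-free data
    print(calibrate(times, observed, [1.5, 0.3]))
```

In a hybrid setup of the kind the abstract describes, each forward run inside these two loops would itself be a multi-threaded (OpenMP) simulation, so the number of cores in use is roughly the number of concurrent forward runs times the threads per run.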