A scalable barotropic mode solver for the parallel ocean program

Authors:
Yong Hu;Xiaomeng Huang;Xiaoge Wang;Haohuan Fu;Shizhen Xu;Huabin Ruan;Wei Xue;Guangwen Yang
Affiliations:
Ministry of Education Key Laboratory for Earth System Modeling, Center for Earth System Science, Tsinghua University, Beijing, China,Tsinghua National Laboratory for Information Science and Techno ...;Ministry of Education Key Laboratory for Earth System Modeling, Center for Earth System Science, Tsinghua University, Beijing, China;Tsinghua National Laboratory for Information Science and Technology (TNList), China;Ministry of Education Key Laboratory for Earth System Modeling, Center for Earth System Science, Tsinghua University, Beijing, China;Ministry of Education Key Laboratory for Earth System Modeling, Center for Earth System Science, Tsinghua University, Beijing, China,Tsinghua National Laboratory for Information Science and Techno ...;Ministry of Education Key Laboratory for Earth System Modeling, Center for Earth System Science, Tsinghua University, Beijing, China,Tsinghua National Laboratory for Information Science and Techno ...;Ministry of Education Key Laboratory for Earth System Modeling, Center for Earth System Science, Tsinghua University, Beijing, China,Tsinghua National Laboratory for Information Science and Techno ...;Ministry of Education Key Laboratory for Earth System Modeling, Center for Earth System Science, Tsinghua University, Beijing, China,Tsinghua National Laboratory for Information Science and Techno ...
Venue:
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
Year:
2013

Abstract

This paper presents a novel strategy for improving the scalability of the barotropic mode in the Parallel Ocean Program (POP), based on a theoretical analysis of the barotropic communication bottleneck. POP discretizes the elliptic equations of the barotropic mode into a linear system Ax = b and solves it with the Preconditioned Conjugate Gradient (PCG) method. PCG scales poorly on distributed systems because the inner products in each iteration require time-consuming global reductions. A performance model is developed to quantify this scaling bottleneck. Based on the model, the classical Stiefel iteration (CSI), which was originally considered less efficient than PCG, is identified as promising for massive parallelism. In contrast to PCG, the recurrence parameters of CSI are determined by the spectrum of the coefficient matrix A rather than by inner products of the residuals from previous iterations. The Lanczos method is used to overcome the difficulty of estimating the eigenvalues of the large-scale matrix A: it constructs a small tridiagonal matrix whose eigenvalues are close to those of A. Replacing PCG with CSI eliminates global reductions, and their inherently poor scalability, from the barotropic mode. The implementation of CSI in POP at 0.1 degree resolution accelerates one barotropic step nearly five times, from 1.23 s to 0.26 s, on 15,000 cores.
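
To make the contrast with PCG concrete, the following is a minimal sketch, not the paper's implementation. It uses a textbook unpreconditioned Chebyshev-type iteration as a stand-in for CSI and a plain Lanczos run to estimate the spectral bounds. The toy matrix, the function names `lanczos_extreme_eigs` and `chebyshev_solve`, the safety margins on the eigenvalue estimates, and the fixed iteration count are all assumptions made for illustration. The main loop contains only matrix-vector products and vector updates, so a distributed version would need neighbor exchanges but no global reduction per iteration. The Lanczos setup does use inner products, but only for a small number of steps before the solve.

```python
import numpy as np


def lanczos_extreme_eigs(A, k, rng):
    """Estimate the extreme eigenvalues of symmetric A from k Lanczos steps.

    Builds a small tridiagonal matrix T and returns its smallest and largest
    eigenvalues (Ritz values), which approximate those of A from the inside.
    """
    n = A.shape[0]
    q_prev = np.zeros(n)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    alphas, betas = [], []
    beta = 0.0
    for _ in range(k):
        w = A @ q - beta * q_prev
        alpha = q @ w
        w -= alpha * q
        alphas.append(alpha)
        beta = np.linalg.norm(w)
        if beta < 1e-12:  # invariant subspace found; stop early
            break
        betas.append(beta)
        q_prev, q = q, w / beta
    m = len(alphas)
    off = betas[: m - 1]
    T = np.diag(alphas) + np.diag(off, 1) + np.diag(off, -1)
    ritz = np.linalg.eigvalsh(T)  # eigenvalues in ascending order
    return ritz[0], ritz[-1]


def chebyshev_solve(A, b, lam_min, lam_max, iters):
    """Chebyshev-type iteration for SPD A with spectrum inside [lam_min, lam_max].

    The recurrence scalars depend only on the two bounds, so the loop
    performs no inner products.
    """
    d = 0.5 * (lam_max + lam_min)  # center of the spectral interval
    c = 0.5 * (lam_max - lam_min)  # half-width of the spectral interval
    x = np.zeros_like(b)
    r = b.copy()
    p = np.zeros_like(b)
    alpha = 0.0
    for i in range(iters):
        if i == 0:
            p = r.copy()
            alpha = 1.0 / d
        else:
            beta = 0.5 * (c * alpha) ** 2 if i == 1 else (0.5 * c * alpha) ** 2
            alpha = 1.0 / (d - beta / alpha)
            p = r + beta * p
        x = x + alpha * p
        r = b - A @ x
    return x


# Toy SPD system (an assumption; not POP's barotropic operator).
n = 100
A = 2.5 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
rng = np.random.default_rng(0)
b = rng.standard_normal(n)

lo, hi = lanczos_extreme_eigs(A, 20, rng)
# Widen the estimated interval slightly, since Ritz values lie inside the spectrum.
x = chebyshev_solve(A, b, 0.9 * lo, 1.1 * hi, 100)
rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(f"estimated spectrum: [{lo:.4f}, {hi:.4f}], relative residual: {rel_res:.2e}")
```

A fixed iteration count is used here only to keep the loop free of norms. A practical solver would still need some convergence check, for example one evaluated every few iterations, at the cost of an occasional reduction.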