Performance Analysis and Optimization on a Parallel Atmospheric General Circulation Model Code

  • Authors:
  • John Z. Lou; John D. Farrara

  • Venue:
  • IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
  • Year:
  • 1997

Abstract

An analysis is presented of the primary factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on distributed-memory, massively parallel computer systems. Several modifications to the original parallel AGCM code, aimed at improving its numerical efficiency, load balance, and single-node performance, are discussed. The impact of these optimization strategies on performance on two state-of-the-art parallel computers, the Intel Paragon and the Cray T3D, is presented and analyzed. Implementing a load-balanced FFT algorithm is found to reduce overall execution time by approximately 45% compared with the original convolution-based algorithm. Preliminary results from applying a load-balancing scheme to the physics part of the AGCM code suggest that additional reductions in execution time of 15-20% can be achieved. Finally, several strategies for improving the single-node performance of the code are presented; the results obtained thus far suggest that reductions in execution time in the range of 30-40% are possible.
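
The abstract contrasts a convolution-based algorithm with an FFT-based one. The following is a minimal sketch of why that switch can save work, not the paper's implementation. It assumes the operation is a filter applied along a periodic line of grid points, and it omits the parallel data distribution and load balancing entirely. The function names (`fft`, `ifft`, `filter_by_convolution`, `filter_by_fft`) and the sample data are hypothetical. Both routines compute the same circular convolution; the direct form costs O(n^2) operations per line, while the FFT form costs O(n log n).

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT via the conjugation identity ifft(X) = conj(fft(conj(X))) / n."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def filter_by_convolution(u, kernel):
    """Direct circular convolution of u with kernel: O(n^2) operations."""
    n = len(u)
    return [sum(kernel[j] * u[(i - j) % n] for j in range(n)) for i in range(n)]


def filter_by_fft(u, kernel):
    """Same filter via the convolution theorem: O(n log n) operations."""
    product = [a * b for a, b in zip(fft(u), fft(kernel))]
    return [v.real for v in ifft(product)]


# Hypothetical data: 8 grid points on a periodic line and a 3-point smoother.
u = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

direct = filter_by_convolution(u, kernel)
via_fft = filter_by_fft(u, kernel)
max_diff = max(abs(a - b) for a, b in zip(direct, via_fft))
```

Here `max_diff` is at the level of floating-point rounding, so the two routes agree. The operation-count argument alone does not account for the 45% figure quoted in the abstract, which also reflects how the work is balanced across processors.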