Performance Portability in the Physical Parameterizations of the Community Atmospheric Model
International Journal of High Performance Computing Applications
A new technique for data privatization in user-level threads and its use in parallel applications
Proceedings of the 2010 ACM Symposium on Applied Computing
A subgrid orography scheme has been applied to the National Center for Atmospheric Research Community Atmosphere Model. The scheme applies all of the model column physics to each of up to 11 elevation classes within each grid cell. The number of elevation classes per grid cell is highly inhomogeneous, which could produce a serious load imbalance if the domain decomposition distributes grid cells evenly across processors. However, because the distribution of classes is static, static load balancing can be used to distribute the elevation classes uniformly across processors. The load balancing is accomplished first by distributing the number of classes evenly within each process. The chunks are then distributed uniformly across processes, and the dynamics-physics transpose cost is minimized by assigning each chunk to the process that holds the most dynamics grid cells from that chunk. Parallel efficiency with the subgrid scheme and load balancing exceeds parallel efficiency without the subgrid scheme for up to 128 processors. Load balancing across processes decreases run time by 10-30%, depending on configuration.
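The idea of static load balancing by elevation-class count can be illustrated with a minimal Python sketch. It is not the scheme's actual implementation: it assumes that the physics cost of a grid cell is proportional to its number of elevation classes, and it uses a generic greedy heuristic (largest cells first, each placed on the currently least-loaded process). The function names `balance_cells`, `even_split`, and `loads` are hypothetical, and chunking and the dynamics-physics transpose cost are ignored.

```python
import heapq


def balance_cells(class_counts, n_procs):
    """Assign grid cells to processes so per-process class totals are similar.

    Greedy heuristic (an assumption, not the paper's algorithm): visit cells
    from most to fewest elevation classes and give each one to the process
    with the smallest load so far.
    """
    heap = [(0, p) for p in range(n_procs)]  # (current load, process id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_procs)]
    order = sorted(range(len(class_counts)),
                   key=lambda i: class_counts[i], reverse=True)
    for cell in order:
        load, proc = heapq.heappop(heap)  # least-loaded process
        assignment[proc].append(cell)
        heapq.heappush(heap, (load + class_counts[cell], proc))
    return assignment


def even_split(n_cells, n_procs):
    """Baseline: contiguous blocks holding equal numbers of grid cells."""
    size = -(-n_cells // n_procs)  # ceiling division
    return [list(range(p * size, min((p + 1) * size, n_cells)))
            for p in range(n_procs)]


def loads(assignment, class_counts):
    """Total number of elevation classes handled by each process."""
    return [sum(class_counts[c] for c in cells) for cells in assignment]
```

For a made-up set of eight cells with class counts `[11, 9, 8, 1, 1, 1, 1, 1]` on two processes, the even split by cell count gives loads of 29 and 4 classes, while the greedy assignment gives 16 and 17. Because the class counts never change during a run, such an assignment can be computed once at start-up.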