Scalable heterogeneous parallelism for atmospheric modeling and simulation

Authors:
John C. Linford; Adrian Sandu
Affiliations:
Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, USA 24061; Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, USA 24061
Venue:
The Journal of Supercomputing
Year:
2011

Abstract

Heterogeneous multicore chipsets with many levels of parallelism are becoming increasingly common in high-performance computing systems. Using the parallelism in these new chipsets effectively is the challenge facing a new generation of large-scale scientific computing applications. This study examines methods for improving the performance of two-dimensional and three-dimensional atmospheric constituent transport simulation on the Cell Broadband Engine Architecture (CBEA). A function offloading approach is used in a 2D transport module, and a vector stream processing approach is used in a 3D transport module. Two methods for transferring non-contiguous data between main memory and accelerator local storage are compared. By leveraging the heterogeneous parallelism of the CBEA, the 3D transport module running on a single PowerXCell 8i chip achieves performance comparable to two nodes of an IBM BlueGene/P or eight Intel Xeon cores. Module performance is reported for two CBEA systems, an IBM BlueGene/P, and an eight-core shared-memory Intel Xeon workstation.