Hybrid MPI-Cell parallelism for hyperbolic PDE simulation on a Cell processor cluster

  • Authors:
  • Scott Rostrup;Hans De Sterck

  • Affiliations:
  • Department of Applied Mathematics, University of Waterloo, Ontario, Canada;Department of Applied Mathematics, University of Waterloo, Ontario, Canada

  • Venue:
  • HPCS'09 Proceedings of the 23rd international conference on High Performance Computing Systems and Applications
  • Year:
  • 2009

Abstract

We show how a numerical simulation method for nonlinear hyperbolic partial differential equation (PDE) systems on structured grids with explicit timestepping can be implemented efficiently for the Cell processor and for clusters of Cell processors. We describe memory layout, communication patterns and optimization steps that are performed to exploit the parallel architecture of the Cell processor. A second layer of Message Passing Interface (MPI) parallelism is added to obtain a hybrid parallel code that can be executed efficiently on Cell clusters. Performance tests are conducted on a Cell cluster, and the Cell performance is compared with x86 performance (Xeon). Compared with single-core Xeon performance, the Cell processor obtains significant speed-ups of 60x for single precision calculations, and 20x for double precision. In a chip-to-chip comparison, the Cell code is 14x faster than a 4-core Xeon (using pthreads) in single precision, and 5x faster in double precision. Parallel cluster scaling results were hampered by a relatively slow interconnect on our test system, but overall our study shows how Cell clusters can be used efficiently for simulating nonlinear hyperbolic PDE systems.
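The two ingredients named in the abstract, explicit timestepping on a structured grid and a layer of message-passing parallelism across subdomains, can be illustrated with a minimal sketch. It is not the authors' code and rests on loud simplifications: a linear 1D advection equation with a first-order upwind scheme stands in for the nonlinear hyperbolic system, a plain in-process copy of boundary values stands in for MPI communication, and the on-chip Cell parallelism is not modelled at all. All function names (`serial_step`, `halo_exchange`, `decomposed_step`, `run_demo`) are hypothetical.

```python
def serial_step(u, c):
    """One explicit upwind step for u_t + a*u_x = 0 (a > 0) on a periodic grid.

    c = a*dt/dx is the CFL number; u[i - 1] wraps around at i = 0.
    """
    return [u[i] - c * (u[i] - u[i - 1]) for i in range(len(u))]


def split(u, parts):
    """Cut the global grid into equally sized contiguous subdomains."""
    size = len(u) // parts
    return [u[p * size:(p + 1) * size] for p in range(parts)]


def halo_exchange(blocks):
    """Return the left ghost value of each subdomain (periodic neighbours).

    Assumption: in a cluster code this would be a message exchange between
    ranks; here it is just a read from the neighbouring list.
    """
    return [blocks[p - 1][-1] for p in range(len(blocks))]


def decomposed_step(blocks, c):
    """Advance every subdomain by one step using only local data plus ghosts."""
    ghosts = halo_exchange(blocks)
    new_blocks = []
    for ghost, block in zip(ghosts, blocks):
        padded = [ghost] + block  # ghost cell sits at index 0
        new_blocks.append(
            [padded[i] - c * (padded[i] - padded[i - 1])
             for i in range(1, len(padded))]
        )
    return new_blocks


def run_demo(n=16, parts=4, steps=10, c=0.5):
    """Run the serial and the decomposed update side by side."""
    u = [1.0 if 4 <= i < 8 else 0.0 for i in range(n)]  # square pulse
    blocks = split(u, parts)
    for _ in range(steps):
        u = serial_step(u, c)
        blocks = decomposed_step(blocks, c)
    joined = [x for block in blocks for x in block]
    return u, joined


if __name__ == "__main__":
    serial, decomposed = run_demo()
    print("results match:", serial == decomposed)
```

Because each subdomain needs only one neighbouring value per step, the decomposed update reproduces the serial one while communicating a single cell per boundary. That small communication-to-computation ratio is what makes explicit structured-grid schemes a natural fit for this kind of layered parallelism.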