Implementation and performance modeling of deterministic particle transport (Sweep3D) on the IBM Cell/B.E.

  • Authors:
  • Olaf Lubeck;Michael Lang;Ram Srinivasan;Greg Johnson

  • Affiliations:
  • Los Alamos National Laboratory, Los Alamos, NM, USA;(Corresponding author: Michael Lang, Los Alamos National Laboratory, TA3 Bldg 2011, Los Alamos, NM 87545, USA. Tel.: +1 505 665 5756/ Fax: +1 505 665 4939/ E-mail: mlang@lanl.gov) Los Alamos Natio ...;Intel, Fort Collins, CO, USA;Google, Mountain View, CA, USA

  • Venue:
  • Scientific Programming - High Performance Computing with the Cell Broadband Engine
  • Year:
  • 2009

Abstract

The IBM Cell Broadband Engine (Cell/B.E.) is a novel multi-core chip with the potential to deliver the floating-point performance demanded by high-fidelity scientific simulations. However, data movement within the chip can be a major obstacle to achieving its peak floating-point rates. In this paper, we present the results of implementing Sweep3D on the Cell/B.E. using an intra-chip message-passing model that minimizes data movement. We compare the advantages and disadvantages of this programming model with those of a previous implementation that used a master-worker threading strategy. We apply a previously validated micro-architecture performance model, based on our earlier work on Monte Carlo performance models, to the application executing on the Cell/B.E.; the model predicts overall CPI (cycles per instruction) and gives a detailed breakdown of processor stalls. Finally, we use the micro-architecture model to assess the performance impact of future design parameters for the Cell/B.E. micro-architecture. The methodologies and results have broader implications that extend to other multi-core architectures.