A performance model of non-deterministic particle transport on large-scale systems

  • Authors:
  • Mark M. Mathis; Darren J. Kerbyson; Adolfy Hoisie

  • Affiliations:
  • Performance and Architecture Lab (PAL), Los Alamos National Laboratory, Los Alamos, NM and Department of Computer Science, Texas A&M University, College Station, TX; Performance and Architecture Lab (PAL), Los Alamos National Laboratory, Los Alamos, NM; Performance and Architecture Lab (PAL), Los Alamos National Laboratory, Los Alamos, NM

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2006

Abstract

In this work we present a predictive analytical model that captures the performance and scaling characteristics of a nondeterministic particle transport application, MCNP (Monte Carlo N-Particle), which represents part of the Advanced Simulation and Computing (ASC) workload. MCNP can be used to simulate neutron, photon, electron, or coupled transport, and has found uses in many problem areas, including nuclear reactors, radiation shielding, and medical physics. Monte Carlo methods in general, and MCNP specifically, do not solve an explicit equation, but rather obtain answers by simulating the interactions between individual particles and a predefined geometry. This is in contrast to deterministic transport methods, the most common of which is the discrete ordinates method, which solve the transport equation directly for the average particle behavior.

Previous studies on the scalability of parallel Monte Carlo calculations have been rather general in nature. The performance model developed here is detailed and parametric, taking as input both application characteristics (e.g. problem size) and system characteristics (e.g. communication latency, bandwidth, achieved processing rate). The model is validated against measurements on an AlphaServer ES40 system, showing high accuracy across many processor/problem combinations. The model is then used to provide insight into the performance achievable on systems containing thousands of processors and to quantify the impact that possible improvements in sub-system performance may have. The performance impact of modifying the communication structure of the code is also quantified.
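To illustrate the distinction the abstract draws between Monte Carlo and deterministic transport, the following is a minimal sketch and not MCNP itself. It assumes a hypothetical one-dimensional, purely absorbing slab, with an illustrative function name (`transmitted_fraction`) and illustrative parameters (`sigma_t`, `thickness`). Each particle's distance to its first collision is sampled at random, and the fraction that crosses the slab is compared with the closed-form average behavior, exp(-sigma_t * thickness), which a deterministic method would compute directly.

```python
import math
import random


def transmitted_fraction(sigma_t, thickness, n_particles, seed=0):
    """Estimate the fraction of particles that cross a purely absorbing slab.

    Illustrative assumption: 1D geometry, normal incidence, and every
    collision absorbs the particle (no scattering).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    transmitted = 0
    for _ in range(n_particles):
        # Distance to the first collision is exponentially distributed
        # with rate sigma_t (the total macroscopic cross section).
        distance = rng.expovariate(sigma_t)
        if distance > thickness:
            transmitted += 1  # particle left the slab without colliding
    return transmitted / n_particles


# Monte Carlo estimate from simulated particle histories.
estimate = transmitted_fraction(sigma_t=1.0, thickness=1.0, n_particles=100_000)

# Average behavior obtained directly from the analytic attenuation law.
analytic = math.exp(-1.0 * 1.0)

print(f"Monte Carlo estimate: {estimate:.4f}")
print(f"Analytic value:       {analytic:.4f}")
```

Because each particle history in this sketch is independent, the loop could be divided among processors with communication needed only to combine the tallies. How that work and communication are actually structured in MCNP is the subject of the paper's model and is not reproduced here.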