A performance model of non-deterministic particle transport on large-scale systems

Authors:
Mark M. Mathis;Darren J. Kerbyson;Adolfy Hoisie
Affiliations:
Performance and Architecture Lab (PAL), Los Alamos National Laboratory, Los Alamos, NM and Department of Computer Science, Texas A&M University, College Station, TX;Performance and Architecture Lab (PAL), Los Alamos National Laboratory, Los Alamos, NM;Performance and Architecture Lab (PAL), Los Alamos National Laboratory, Los Alamos, NM
Venue:
Future Generation Computer Systems
Year:
2006

Abstract

In this work we present a predictive analytical model that captures the performance and scaling characteristics of a non-deterministic particle transport application, MCNP (Monte Carlo N-Particle), which represents part of the Advanced Simulation and Computing (ASC) workload. MCNP can simulate neutron, photon, electron, or coupled transport, and has found uses in many problem areas including nuclear reactors, radiation shielding, and medical physics. Monte Carlo methods in general, and MCNP specifically, do not solve an explicit equation, but rather obtain answers by simulating the interactions between individual particles and a predefined geometry. This is in contrast to deterministic transport methods, the most common of which is the discrete ordinates method, which solve the transport equation directly for the average particle behavior.

Previous studies on the scalability of parallel Monte Carlo calculations have been rather general in nature. The performance model developed here is both detailed and parametric, taking as input both application characteristics (e.g. problem size) and system characteristics (e.g. communication latency, bandwidth, achieved processing rate). The model is validated against measurements on an AlphaServer ES40 system and shows high accuracy across many processor/problem combinations. The model is then used to provide insight into the performance achievable on systems containing thousands of processors and to quantify the impact that possible improvements in sub-system performance may have. The performance impact of modifying the communication structure of the code is also quantified.
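To make the idea of a parametric model concrete, the sketch below shows what a compute-plus-communication estimate driven by the inputs named in the abstract (problem size, latency, bandwidth, achieved processing rate) might look like. It is not the model from the paper. The functional form, the single collecting process, and every name and number in it (`SystemParams`, `predicted_time`, `msg_bytes`, `n_rendezvous`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class SystemParams:
    """Hypothetical system inputs; names and units are assumptions."""
    latency_s: float         # per-message latency, in seconds
    bandwidth_Bps: float     # sustained bandwidth, in bytes per second
    rate_particles_s: float  # achieved per-processor rate, particles per second


def predicted_time(n_particles, n_procs, msg_bytes, n_rendezvous, system):
    """Toy estimate: evenly divided compute plus serialized result collection.

    Assumes (hypothetically) that at each of `n_rendezvous` synchronization
    points one collecting process receives one message of `msg_bytes` from
    every other process, one after another.
    """
    compute = n_particles / (n_procs * system.rate_particles_s)
    per_message = system.latency_s + msg_bytes / system.bandwidth_Bps
    communication = n_rendezvous * (n_procs - 1) * per_message
    return compute + communication


if __name__ == "__main__":
    # Illustrative values only; these are not measurements of any real system.
    example = SystemParams(latency_s=5e-6, bandwidth_Bps=3e8,
                           rate_particles_s=1e4)
    for procs in (1, 16, 256, 4096):
        t = predicted_time(1e7, procs, msg_bytes=1e6, n_rendezvous=10,
                           system=example)
        print(f"P={procs:5d}  predicted time = {t:9.2f} s")
```

Under these assumed inputs the compute term shrinks as processors are added while the collection term grows. A toy model like this can explore that kind of trade-off when asking how a change in latency, bandwidth, or communication structure would affect scaling.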