The key contribution of this paper is the first general model that predicts the running time of transport sweeps on orthogonal grids for any regular mapping of grid cells to processors. Our model, which accounts for machine-dependent parameters such as computation cost and communication latency, can be used to analyze and compare the effects of various spatial decompositions on the running time of the transport sweep. Insight obtained from the model yields two significant contributions to the theory of optimal transport sweeps on orthogonal grids. First, our model provides a theoretical basis that explains why, and under what circumstances, the column decomposition of the current standard KBA algorithm is superior to the 'balanced' decomposition obtained by classic domain decomposition techniques. Second, our model enables us to identify a new decomposition, which we call Hybrid, that proves to be almost as good as, and sometimes superior to, the current standard KBA method. Our analysis covers sweeps in two- and three-dimensional spatial domains; it first considers sweeps in only one direction and then sweeps involving multiple simultaneous directions. We obtain expressions for the completion time and discuss theoretical results.
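To make the idea of comparing decompositions concrete, the following is a minimal toy sketch and not the model developed in the paper, whose expressions are not reproduced in this abstract. It simulates a single-direction sweep on a two-dimensional orthogonal grid in which each cell depends on its left and lower neighbors, and it charges an assumed per-cell computation cost `t_cell` and an assumed per-message cost `latency` whenever a dependency crosses processors. The function names (`sweep_completion_time`, `strip_owner`, `block_owner`), the greedy wavefront ordering, and the cost parameters are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, Hashable, Tuple

Cell = Tuple[int, int]


def sweep_completion_time(
    nx: int,
    ny: int,
    owner: Callable[[int, int], Hashable],
    t_cell: float,
    latency: float,
) -> float:
    """Completion time of one sweep from corner (0, 0) under a toy cost model.

    Cell (i, j) needs cells (i-1, j) and (i, j-1). Each processor handles its
    cells one at a time, in global wavefront order (an assumed schedule).
    """
    finish: Dict[Cell, float] = {}
    proc_free: Dict[Hashable, float] = {}
    # Sorting by i + j gives a valid topological order of the dependencies.
    cells = sorted(
        ((i, j) for i in range(nx) for j in range(ny)),
        key=lambda c: (c[0] + c[1], c[0]),
    )
    for i, j in cells:
        p = owner(i, j)
        ready = 0.0
        for pred in ((i - 1, j), (i, j - 1)):
            if pred in finish:  # predecessors outside the grid are skipped
                hop = latency if owner(*pred) != p else 0.0
                ready = max(ready, finish[pred] + hop)
        start = max(ready, proc_free.get(p, 0.0))
        finish[(i, j)] = start + t_cell
        proc_free[p] = finish[(i, j)]
    return max(finish.values())


def strip_owner(nx: int, p: int) -> Callable[[int, int], int]:
    """Column-strip mapping: p vertical strips (assumes p divides nx)."""
    width = nx // p
    return lambda i, j: i // width


def block_owner(nx: int, ny: int, px: int, py: int) -> Callable[[int, int], int]:
    """Block mapping on a px-by-py processor array (assumes exact division)."""
    bw, bh = nx // px, ny // py
    return lambda i, j: (i // bw) * py + (j // bh)


if __name__ == "__main__":
    n, t, lat = 16, 1.0, 0.5
    print("strips:", sweep_completion_time(n, n, strip_owner(n, 4), t, lat))
    print("blocks:", sweep_completion_time(n, n, block_owner(n, n, 2, 2), t, lat))
```

Changing `t_cell`, `latency`, or the mapping function gives a rough feel for how the choice of decomposition interacts with machine parameters. The sketch covers only one sweep direction in two dimensions and uses a fixed greedy schedule, so its numbers should not be read as the paper's results.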