The method of discrete ordinates is commonly used to solve the Boltzmann radiation transport equation for applications ranging from simulations of fires to weapons effects. The equations are most efficiently solved by sweeping the radiation flux across the computational grid. For unstructured grids this poses several interesting challenges, particularly when implemented on distributed-memory parallel machines where the grid geometry is spread across processors. We describe an asynchronous, parallel, message-passing algorithm that performs sweeps simultaneously from many directions across unstructured grids. We identify key factors that limit the algorithm's parallel scalability and discuss two enhancements we have made to the basic algorithm: one to prioritize the work within a processor's subdomain and the other to better decompose the unstructured grid across processors. Performance results are given for the basic and enhanced algorithms implemented within a radiation solver running on hundreds of processors of Sandia's Intel Tflops machine and DEC-Alpha CPlant cluster.
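
To make the idea of a sweep concrete, the following is a minimal serial sketch, not the asynchronous message-passing algorithm described above. It assumes that, for one ordinate direction, a cell can be solved only after all of its upwind neighbors, and it orders the cells with a standard topological sort (Kahn's algorithm). The function name `sweep_order`, the `centers` and `neighbors` inputs, and the center-based upwind test are all hypothetical simplifications introduced here for illustration; a production solver would typically decide upwind and downwind from face normals.

```python
from collections import deque


def sweep_order(centers, neighbors, direction):
    """Return one valid serial sweep order for a single direction.

    centers:   dict mapping cell id -> coordinate tuple (hypothetical input)
    neighbors: dict mapping cell id -> list of adjacent cell ids; assumed
               symmetric (if b is listed for a, then a is listed for b)
    direction: tuple giving the ordinate direction of the sweep
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Build the dependency graph: a -> b when b lies downwind of a.
    # Assumption: "downwind" is judged from cell centers, not face normals.
    downwind = {c: [] for c in centers}
    upwind_count = {c: 0 for c in centers}
    for a, adjacent in neighbors.items():
        for b in adjacent:
            offset = [cb - ca for ca, cb in zip(centers[a], centers[b])]
            if dot(offset, direction) > 0:
                downwind[a].append(b)
                upwind_count[b] += 1

    # Kahn's algorithm: a cell becomes ready once all upwind cells are done.
    ready = deque(sorted(c for c in centers if upwind_count[c] == 0))
    order = []
    while ready:
        cell = ready.popleft()
        order.append(cell)
        for nxt in downwind[cell]:
            upwind_count[nxt] -= 1
            if upwind_count[nxt] == 0:
                ready.append(nxt)

    # The center-based rule above cannot create cycles; this guard matters
    # only if the upwind test is swapped for one that can (e.g. face normals).
    if len(order) != len(centers):
        raise ValueError("cyclic dependency among cells for this direction")
    return order


if __name__ == "__main__":
    # Four cells arranged as a 2x2 block, swept from the lower-left corner.
    centers = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}
    neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
    print(sweep_order(centers, neighbors, (1.0, 1.0)))
```

In this sketch the `ready` queue is first-in, first-out. Replacing it with a priority queue is one place where a within-subdomain prioritization of the kind mentioned in the abstract could plug in, although the specific priority rule used by the authors is not stated here.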