Implicit methods for partial differential equations on unstructured meshes offer an efficient solution strategy for many real-world problems (e.g., simulation-based virtual surgical planning). Scalable solvers employing these methods not only enable the solution of extremely large practical problems but also lead to dramatic reductions in time-to-solution. We present a parallelization paradigm and associated procedures that enable our implicit, unstructured flow solver to achieve strong scalability. We consider fluid-flow examples in two application areas to show the effectiveness of our procedures, which yield near-perfect strong scaling on various (including near-petascale) systems. The first area considers a double-throat nozzle (DTN), whereas the second considers a patient-specific abdominal aortic aneurysm (AAA) model. We present excellent strong scaling on three cases ranging from relatively small to large: a DTN model with O(10^6) elements on up to 8,192 cores (9 core-doublings), an AAA model with O(10^8) elements on up to 32,768 cores (6 core-doublings), and an AAA model with O(10^9) elements on up to 163,840 cores.