An efficient parallel iterative method with selective blocking preconditioning has been developed for symmetric multiprocessor (SMP) cluster architectures with vector processors, such as the Earth Simulator. The method is based on a three-level hybrid parallel programming model: message passing for communication between SMP nodes, OpenMP loop directives for parallelization within each SMP node, and vectorization on each processing element (PE). In 3D geophysical simulations with contact conditions performed on the Earth Simulator, the method provides robust and smooth convergence together with excellent vector and parallel performance. Selective blocking preconditioning is much more efficient than ILU(1) and ILU(2). For the complicated Southwest Japan model, with more than 23 M DOF on 10 SMP nodes (80 PEs) of the Earth Simulator, performance was 161.7 GFLOPS (25.3% of peak) for the hybrid programming model and 190.4 GFLOPS (29.8% of peak) for flat MPI.
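The percent-of-peak figures can be checked with a short calculation. The sketch below is illustrative only: the abstract does not state the per-PE peak, so the value of 8 GFLOPS per Earth Simulator vector processor is an assumption, and the function name `percent_of_peak` is hypothetical. The node and PE counts and the sustained GFLOPS values are taken from the abstract.

```python
# Assumption (not stated in the abstract): each Earth Simulator PE has a
# theoretical peak of 8 GFLOPS.
PEAK_GFLOPS_PER_PE = 8.0

# From the abstract: 10 SMP nodes with 80 PEs in total, i.e. 8 PEs per node.
NODES = 10
PES_PER_NODE = 8


def percent_of_peak(sustained_gflops: float) -> float:
    """Return sustained performance as a percentage of aggregate peak."""
    aggregate_peak = NODES * PES_PER_NODE * PEAK_GFLOPS_PER_PE  # GFLOPS
    return 100.0 * sustained_gflops / aggregate_peak


# Sustained figures reported in the abstract.
hybrid = percent_of_peak(161.7)    # hybrid (MPI + OpenMP + vectorization)
flat_mpi = percent_of_peak(190.4)  # flat MPI

print(f"hybrid:   {hybrid:.2f}% of peak")
print(f"flat MPI: {flat_mpi:.2f}% of peak")
```

Under that assumption the aggregate peak is 640 GFLOPS, and the two ratios come out at roughly 25.3% and 29.8%, consistent with the reported percentages.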