Regional weather forecasting demands fast simulation over fine-grained grids, which makes the computation extremely memory-bound, a difficult problem on conventional supercomputers. Early work on accelerating the mainstream weather code WRF with GPUs, which offer high memory performance, yielded only minor speedup because only part of the huge code was ported to the GPU. Our full CUDA port of the high-resolution weather prediction model ASUCA is, to our knowledge, the first of its kind; ASUCA is a next-generation production weather code developed by the Japan Meteorological Agency, similar to WRF in its underlying physics (a non-hydrostatic model). Benchmarks on the 528-GPU (NVIDIA GT200 Tesla) TSUBAME supercomputer at the Tokyo Institute of Technology demonstrated over 80-fold speedup and good weak scaling, achieving 15.0 TFlops in single precision for a 6956 x 6052 x 48 mesh. Further benchmarks on TSUBAME 2.0, which will comprise over 4000 NVIDIA Fermi GPUs and will be deployed in October 2010, will be presented.