The graphics processor (GPU) has evolved into an appealing choice for high-performance computing due to its superior memory bandwidth, raw processing power, and flexible programmability. As such, GPUs represent an excellent platform for accelerating scientific applications. This paper explores a methodology for identifying applications that present significant potential for acceleration. In particular, this work focuses on experiences from accelerating S3D, a high-fidelity turbulent reacting flow solver. The acceleration process is examined from a holistic viewpoint, including details that arise in the different phases of the conversion. This paper also addresses floating-point accuracy and precision on the GPU, a topic of immense importance to scientific computing. Several performance experiments are conducted, and results are presented for the NVIDIA Tesla C1060 GPU. We generalize from our experiences to provide a roadmap for deploying existing scientific applications on heterogeneous GPU platforms.