GPUs are now widely used in scientific computing and engineering applications, owing primarily to the evolution of GPU architecture. First, we analyze several key performance characteristics of the GPU in detail, along with the relationships among GPU architecture, programming model, and memory hierarchy. Second, we present three performance optimization strategies: Prefetching, Streamlizing, and Task Division. We conduct experiments to characterize how the different factors relate to efficiency. Finally, we map the HPL benchmark to the GPU to validate our strategies and achieve a speedup.
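The abstract does not say how Task Division is carried out, so the following is only a minimal sketch of one common reading of the idea: split a block of matrix columns between GPU and CPU in proportion to their throughput, so that both finish at roughly the same time. The function name `split_columns` and the parameters `gpu_gflops` and `cpu_gflops` are hypothetical, introduced here for illustration, and the proportional rule is an assumption rather than the authors' scheme.

```python
def split_columns(n_cols, gpu_gflops, cpu_gflops):
    """Divide n_cols columns between GPU and CPU by assumed throughput.

    All names are hypothetical; the proportional rule is an assumption,
    not the method described in the paper.
    """
    if n_cols < 0 or gpu_gflops <= 0 or cpu_gflops <= 0:
        raise ValueError("n_cols must be >= 0 and throughputs must be > 0")
    # Fraction of the work the GPU should take so both devices
    # finish at about the same time.
    gpu_share = gpu_gflops / (gpu_gflops + cpu_gflops)
    gpu_cols = round(n_cols * gpu_share)
    # The CPU takes whatever remains, so no column is lost or duplicated.
    return gpu_cols, n_cols - gpu_cols


# Example with made-up throughput figures (300 vs 100 GFLOP/s).
gpu_cols, cpu_cols = split_columns(1000, 300.0, 100.0)
print(gpu_cols, cpu_cols)  # prints "750 250"
```

In practice the throughput figures would have to be measured for the kernel and problem size in question, since sustained rates differ from peak rates.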