Summed-area table algorithm optimization based on the OpenCL
Shengen Yan; Yunquan Zhang; Guoping Long
Proceedings of the ATIP/A*CRC Workshop on Accelerator Technologies for High-Performance Computing: Does Asia Lead the Way?
GPURoofline: a model for guiding performance optimizations on GPUs
Haipeng Jia; Yunquan Zhang; Guoping Long; Jianliang Xu; Shengen Yan; Yan Li
Euro-Par'12 Proceedings of the 18th international conference on Parallel Processing
An insightful program performance tuning chain for GPU computing
Haipeng Jia; Yunquan Zhang; Guoping Long; Shengen Yan
ICA3PP'12 Proceedings of the 12th international conference on Algorithms and Architectures for Parallel Processing - Volume Part I
StreamScan: fast scan algorithms for GPUs without global barrier synchronization
Shengen Yan; Guoping Long; Yunquan Zhang
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
yaSpMV: yet another SpMV framework on GPUs
Shengen Yan; Chao Li; Yunquan Zhang; Huiyang Zhou
Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
Hi-index | 412.52 |
H-index | 1 |
P-index | 0.6667 |
up-index | 0.7333 |