Graphics processing units (GPUs) have rapidly emerged as a significant player in high-performance computing. GPUs typically use single instruction, multiple thread (SIMT) pipelines to exploit parallelism and maximize performance. Although GPUs now include support for unstructured control flow, efficiently managing thread divergence for arbitrary parallel programs remains a critical challenge. In this paper, we focus on the problem of supporting recursion in modern GPUs. We design and comparatively evaluate several algorithms for managing the thread divergence encountered in recursive programs. The results improve upon traditional post-dominator-based reconvergence mechanisms, which were designed to handle thread divergence due to control flow within a procedure.
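To make the divergence problem concrete, the following is a minimal toy model, not the algorithms evaluated in the paper. It assumes a warp whose lanes each run the same naive recursive Fibonacci function on different inputs, that every call costs one lockstep step, and that a lane that finishes early sits idle until the slowest lane returns. The names `fib_calls` and `warp_utilization` are hypothetical and chosen only for illustration.

```python
def fib_calls(n: int) -> int:
    """Count the calls made by a naive recursive fib(n), including the root call."""
    if n < 2:
        return 1
    return 1 + fib_calls(n - 1) + fib_calls(n - 2)


def warp_utilization(inputs: list[int]) -> float:
    """Fraction of lane-steps doing useful work in the toy lockstep model.

    Assumption: the warp stays occupied for as many steps as its slowest
    lane needs, and lanes that finish early are masked off (idle).
    """
    work = [fib_calls(n) for n in inputs]
    steps = max(work)                      # warp runs until the slowest lane is done
    return sum(work) / (steps * len(work))


if __name__ == "__main__":
    uniform = [5, 5, 5, 5]                 # every lane recurses equally
    divergent = [1, 2, 3, 5]               # lanes recurse to different extents
    print(f"uniform:   {warp_utilization(uniform):.2f}")
    print(f"divergent: {warp_utilization(divergent):.2f}")
```

In this model a warp with identical inputs keeps every lane busy, while mixed inputs leave most lanes idle for much of the run. Real hardware behaves in more complicated ways, so the numbers are only indicative of why recursion stresses reconvergence schemes built for control flow inside one procedure.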