Stack-based parallel recursion on graphics processors

  • Authors:
  • Ke Yang;Bingsheng He;Qiong Luo;Pedro V. Sander;Jiaoying Shi

  • Affiliations:
  • Zhejiang University, Hangzhou, China;Hong Kong University of Science and Technology, Hong Kong, Hong Kong;Hong Kong University of Science and Technology, Hong Kong, Hong Kong;Hong Kong University of Science and Technology, Hong Kong, Hong Kong;Zhejiang University, Hangzhou, China

  • Venue:
  • Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Recent research has shown promising results on using graphics processing units (GPUs) to accelerate general-purpose computation. However, today's GPUs do not support recursive functions. As a result, for inherently recursive algorithms such as tree traversal, GPU programmers need to explicitly use stacks to emulate the recursion. Parallelizing such stack-based implementation on the GPU increases the programming difficulty; moreover, it is unclear how to improve the efficiency of such parallel implementations. As a first step to address both ease of programming and efficiency issues, we propose three parallel stack implementation alternatives that differ in the granularity of stack sharing. Taking tree traversals as an example, we study the performance tradeoffs between these alternatives and analyze their behaviors in various situations. Our results could be useful to both GPU programmers and GPU compiler writers.