Efficient decomposition of strongly connected components on GPUs

  • Authors:
  • Guohui Li;Zhe Zhu;Zhang Cong;Fumin Yang

  • Affiliations:
  • School of Computer Science & Technology, Huazhong University of Science & Technology, China;School of Computer Science & Technology, Huazhong University of Science & Technology, China;School of Mathematics & Computer Science, Wuhan Polytechnic University, China;School of Computer Science & Technology, Huazhong University of Science & Technology, China

  • Venue:
  • Journal of Systems Architecture: the EUROMICRO Journal
  • Year:
  • 2014

Abstract

The GPU (Graphics Processing Unit) has recently become one of the most power-efficient processors in embedded and many other environments, and it is being integrated into a growing number of SoCs (Systems on Chip). Modern GPUs therefore play an important role in power-aware computing. Strongly Connected Component (SCC) decomposition is a fundamental graph algorithm with wide applications in model checking, electronic design automation, social network analysis, and other fields. GPUs have shown great potential for accelerating many types of computation, including graph algorithms. Recent work has demonstrated the feasibility of SCC decomposition on GPUs, but the existing implementation is inefficient because it gives insufficient consideration to the distinctive GPU programming model, which leads to poor performance on irregular and sparse graphs. This paper presents a new GPU SCC decomposition algorithm that focuses on fully utilizing contemporary embedded and desktop GPU architectures. In particular, a subgraph numbering scheme is proposed to manage subgraph IDs safely and efficiently and to serve as the basis for efficient source selection. Furthermore, we adopt a multi-source partition procedure that greatly reduces the recursion depth, and a vertex labeling approach that optimizes GPU memory access. The evaluation results show that the proposed approach achieves up to 41x speedup over Tarjan's algorithm, one of the most efficient sequential SCC decomposition algorithms, and up to 3.8x speedup over previous GPU algorithms.
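
For readers unfamiliar with the problem, the following is a minimal sketch of Tarjan's algorithm, the sequential baseline named in the abstract. It is not the paper's GPU method, and the function name `tarjan_scc`, the adjacency-dictionary input format, and the example graph are assumptions made purely for illustration.

```python
from typing import Dict, List


def tarjan_scc(graph: Dict[int, List[int]]) -> List[List[int]]:
    """Return the strongly connected components of a directed graph.

    `graph` maps each vertex to its list of successors (an assumed,
    illustrative input format). Each component is returned as a sorted list.
    """
    index: Dict[int, int] = {}    # discovery order of each vertex
    lowlink: Dict[int, int] = {}  # smallest index reachable from the vertex
    stack: List[int] = []         # vertices of components not yet completed
    on_stack = set()
    sccs: List[List[int]] = []
    counter = 0

    def visit(v: int) -> None:
        nonlocal counter
        index[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)

        for w in graph.get(v, []):
            if w not in index:
                # Tree edge: recurse, then propagate the child's lowlink.
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                # Edge back into the component currently being built.
                lowlink[v] = min(lowlink[v], index[w])

        if lowlink[v] == index[v]:
            # v is the root of a component: pop the stack down to v.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            sccs.append(sorted(component))

    for v in graph:
        if v not in index:
            visit(v)
    return sccs


# Hypothetical example: a 3-cycle, a 2-cycle, and an isolated vertex.
example = {0: [1], 1: [2], 2: [0, 3], 3: [4], 4: [3], 5: []}
components = tarjan_scc(example)
```

The sketch relies on a single depth-first traversal, which is inherently sequential. This recursive form is also limited by Python's recursion depth on long paths, so an iterative variant would be needed for large graphs.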