On Identifying Strongly Connected Components in Parallel
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing
Finding strongly connected components in parallel using O(log2n) reachability queries
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Numerical Recipes 3rd Edition: The Art of Scientific Computing
Numerical Recipes 3rd Edition: The Art of Scientific Computing
Gpu gems 3
Introduction to Algorithms, Third Edition
Introduction to Algorithms, Third Edition
Parallel algorithms for finding SCCs in implicitly given graphs
FMICS'06/PDMC'06 Proceedings of the 11th international workshop, FMICS 2006 and 5th international workshop, PDMC conference on Formal methods: Applications and technology
Accelerating large graph algorithms on the GPU using CUDA
HiPC'07 Proceedings of the 14th international conference on High performance computing
An effective GPU implementation of breadth-first search
Proceedings of the 47th Design Automation Conference
Accelerating CUDA graph algorithms at maximum warp
Proceedings of the 16th ACM symposium on Principles and practice of parallel programming
Distributed Algorithms for SCC Decomposition
Journal of Logic and Computation
Computing Strongly Connected Components in Parallel on CUDA
IPDPS '11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
Design and implementation of the HPCS graph analysis benchmark on symmetric multiprocessors
HiPC'05 Proceedings of the 12th international conference on High Performance Computing
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming
GPU Acceleration on Embedded Devices. A Power Consumption Approach
HPCC '12 Proceedings of the 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems
Hi-index | 0.00 |
The GPU (Graphics Processing Unit) has recently become one of the most power efficient processors in embedded and many other environments, and has been integrated into more and more SoCs (System on Chip). Thus modern GPUs play a very important role in power aware computing. Strongly Connected Component (SCC) decomposition is a fundamental graph algorithm which has wide applications in model checking, electronic design automation, social network analysis and other fields. GPUs have been shown to have great potential in accelerating many types of computations including graph algorithms. Recent work have demonstrated the plausibility of GPU SCC decomposition, but the implementation is inefficient due to insufficient consideration of the distinguishing GPU programming model, which leads to poor performance on irregular and sparse graphs. This paper presents a new GPU SCC decomposition algorithm that focuses on full utilization of the contemporary embedded and desktop GPU architecture. In particular, a subgraph numbering scheme is proposed to facilitate the safe and efficient management of the subgraph IDs and to serve as the basis of efficient source selection. Furthermore, we adopt a multi-source partition procedure that greatly reduces the recursion depth and use a vertex labeling approach that can highly optimize the GPU memory access. The evaluation results show that the proposed approach achieves up to 41x speedup over Tarjan's algorithm, one of the most efficient sequential SCC decomposition algorithms, and up to 3.8x speedup over the previous GPU algorithms.