Graphics Processing Units (GPUs) are used for many general-purpose computations because of the high compute power available on them. Regular, data-parallel algorithms map well to the SIMD architecture of current GPUs. Irregular algorithms on discrete structures like graphs are harder to map to them. Efficient data-mapping primitives can play a crucial role in mapping such algorithms onto the GPU. In this paper, we present a minimum spanning tree algorithm on NVIDIA GPUs under CUDA, as a recursive formulation of Borůvka's approach for undirected graphs. We implement it using scalable primitives such as scan, segmented scan, and split. The irregular steps of supervertex formation and recursive graph construction are mapped to primitives such as split, which groups elements into categories keyed by vertex ids and edge weights. We obtain a 30 to 50 times speedup over the CPU implementation on most graphs and a 3 to 10 times speedup over our previous GPU implementation. We construct the minimum spanning tree of a graph with 5 million nodes and 30 million edges in under 1 second on one quarter of the Tesla S1070 GPU.
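
To make the role of segmented scan concrete, the following is a minimal sequential sketch in Python of one step of a Borůvka-style algorithm: finding the minimum-weight edge incident to each vertex. It is not the paper's CUDA implementation. The function names `segmented_min_scan` and `min_edge_per_vertex`, and the compressed adjacency layout (`offsets`, `weights`), are assumptions made for illustration only. On a GPU, the loop in the scan would be replaced by a parallel segmented scan primitive.

```python
from typing import List, Optional


def segmented_min_scan(values: List[int], flags: List[int]) -> List[int]:
    """Inclusive segmented min-scan (sequential emulation).

    flags[i] == 1 marks the first element of a segment. Within each
    segment, out[i] is the minimum of the segment's values up to i.
    """
    out: List[int] = []
    for value, flag in zip(values, flags):
        if flag or not out:
            out.append(value)                 # a new segment starts here
        else:
            out.append(min(out[-1], value))   # continue the running minimum
    return out


def min_edge_per_vertex(offsets: List[int],
                        weights: List[int]) -> List[Optional[int]]:
    """Minimum incident edge weight for each vertex.

    Assumed layout (hypothetical): the edges of vertex v occupy
    weights[offsets[v]:offsets[v + 1]], so len(offsets) == n + 1.
    Vertices with no edges yield None.
    """
    n = len(offsets) - 1
    flags = [0] * len(weights)
    for v in range(n):
        if offsets[v] < offsets[v + 1]:
            flags[offsets[v]] = 1             # segment head for vertex v
    scanned = segmented_min_scan(weights, flags)
    # The last element of each segment holds that segment's minimum.
    return [
        scanned[offsets[v + 1] - 1] if offsets[v] < offsets[v + 1] else None
        for v in range(n)
    ]


if __name__ == "__main__":
    # Triangle graph: edges 0-1 (weight 4), 0-2 (weight 1), 1-2 (weight 3),
    # with each undirected edge stored once per endpoint.
    offsets = [0, 2, 4, 6]
    weights = [4, 1, 4, 3, 1, 3]
    print(min_edge_per_vertex(offsets, weights))
```

In a full Borůvka iteration, the selected minimum edges would then be used to merge vertices into supervertices and to build the reduced graph for the next level of recursion; those steps are omitted from this sketch.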