The GPU is an efficient accelerator for regular data-parallel workloads, but GPU acceleration is more difficult for graph algorithms and other applications with irregular memory access patterns and large memory footprints. The minimum spanning tree (MST) problem arises in a variety of applications, and its solution exemplifies the difficulties of mapping irregular algorithms to the GPU. In this paper, we present a memory-efficient parallel algorithm for finding the minimum spanning tree of very large graphs by introducing a data-parallel implementation of Kruskal's algorithm. We test scalability and performance on random and real-world graphs with up to 25 million vertices and 240 million edges on an Nvidia Tesla T10 GPU with 4 GB of memory. Our method can process graphs 4x larger and up to 10x faster than was possible with the recently published implementation of Borůvka's MST algorithm for the GPU. We also demonstrate the performance advantage of the proposed method over the multi-core filter-Kruskal MST algorithm on a dual quad-core CPU server with Nehalem X5550 processors.
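For readers unfamiliar with the underlying method, the following is a minimal sequential sketch of textbook Kruskal's algorithm in Python. It is not the data-parallel GPU implementation described in the abstract, whose details are not given here. The function name `kruskal_mst`, the `(weight, u, v)` edge layout, and the union-find helper are illustrative assumptions.

```python
from typing import List, Tuple

# Assumed edge layout for this sketch: (weight, u, v), vertices numbered 0..n-1.
Edge = Tuple[float, int, int]


def kruskal_mst(num_vertices: int, edges: List[Edge]) -> List[Edge]:
    """Return the edges of a minimum spanning forest (textbook sequential Kruskal)."""
    parent = list(range(num_vertices))  # union-find: each vertex starts as its own root

    def find(x: int) -> int:
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest: List[Edge] = []
    # Consider edges in nondecreasing weight order.
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            # The edge joins two different components, so it cannot close a cycle.
            parent[root_u] = root_v
            forest.append((weight, u, v))
    return forest


if __name__ == "__main__":
    sample = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
    print(kruskal_mst(4, sample))
```

The sort and the sequence of union-find queries dominate the running time, and the edge-by-edge dependency in the loop is one reason the algorithm is not straightforward to express in a data-parallel form.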