Breadth-first search (BFS) is widely used in electronic design automation (EDA) and in other fields. Researchers have tried to accelerate BFS on the GPU, but both previously published implementations are asymptotically slower than the fastest CPU implementation. In this paper, we present a new GPU implementation of BFS that uses a hierarchical queue management technique and a three-layer kernel arrangement strategy. It guarantees the same computational complexity as the fastest sequential version and achieves up to a 10x speedup.
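To make the complexity claim concrete, here is a minimal sequential sketch of a frontier-queue BFS in Python. It is not the paper's GPU implementation. The function name `bfs_levels`, the `block_size` parameter, and the use of per-chunk local queues as a stand-in for hierarchical queue management are assumptions made for illustration only. The point it shows is that each vertex enters a queue once and each edge is examined once, which gives the O(V + E) work of the standard sequential algorithm.

```python
from typing import Dict, List


def bfs_levels(adj: Dict[int, List[int]], source: int,
               block_size: int = 2) -> Dict[int, int]:
    """Level-synchronous BFS over an adjacency-list graph.

    Assumption: `block_size` chunks of the frontier stand in for GPU
    thread blocks, and each chunk fills its own local queue before the
    local queues are merged into the next global frontier.
    """
    level = {source: 0}          # vertex -> BFS depth; doubles as visited set
    frontier = [source]          # global queue for the current level
    depth = 0
    while frontier:
        depth += 1
        next_frontier: List[int] = []
        for start in range(0, len(frontier), block_size):
            local_queue: List[int] = []      # hypothetical per-block queue
            for u in frontier[start:start + block_size]:
                for v in adj.get(u, []):
                    if v not in level:       # each vertex is enqueued once
                        level[v] = depth
                        local_queue.append(v)
            next_frontier.extend(local_queue)  # merge into the global queue
        frontier = next_frontier
    return level


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
    print(bfs_levels(graph, 0))
```

On a real GPU the local queues would be filled concurrently and the visited check would need an atomic operation; this sketch leaves both out and only models the queue structure.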