The ability to provide uniform shared-memory access to a significant number of processors in a single SMP node brings us much closer to the ideal PRAM parallel computer. Many PRAM algorithms can be adapted to SMPs with few modifications. Yet there are few studies that deal with the implementation and performance issues of running PRAM-style algorithms on SMPs. Our study in this paper focuses on implementing parallel spanning tree algorithms on SMPs. Spanning tree is an important problem because it is a building block for many other parallel graph algorithms and because it is representative of a large class of irregular combinatorial problems that have simple and efficient sequential implementations and fast PRAM algorithms, but often have no known efficient parallel implementations. Experimental studies have been conducted on related problems (minimum spanning tree and connected components) using parallel computers, but they achieved reasonable speedup only on regular graph topologies that can be implicitly partitioned with good locality features or on very dense graphs with limited numbers of vertices. In this paper we present a new randomized algorithm and implementation with superior performance that for the first time achieves parallel speedup on arbitrary graphs (both regular and irregular topologies) when compared with the best sequential implementation for finding a spanning tree. This new algorithm uses several techniques to give an expected running time that scales linearly with the number p of processors for suitably large inputs (n > p^2). As the spanning tree problem is notoriously hard for any parallel implementation to achieve reasonable speedup, our study may shed new light on implementing PRAM algorithms for shared-memory parallel computers.
The main results of this paper are:
1. A new and practical spanning tree algorithm for symmetric multiprocessors that exhibits parallel speedups on graphs with regular and irregular topologies; and
2. An experimental study of parallel spanning tree algorithms that reveals the superior performance of our new approach compared with the previous algorithms.
The source code for these algorithms is freely available from our web site, pc.ece.unm.edu.