In this paper, we efficiently map a priority queue onto the hypercube architecture in a load-balanced manner, with no additional communication overhead, and present optimal parallel algorithms for performing insert and deletemin operations. Two implementations of these operations are proposed on the single-port hypercube model. In a b-bandwidth, n-item priority queue in which every node contains b items in sorted order, the first implementation achieves an optimal speed-up of $O\left(\min\left\{\log n, \frac{b \log n}{\log b + \log\log n}\right\}\right)$ for inserting b presorted items or deleting the b smallest items, where $b = O(n^{1/c})$ with $c > 1$. In particular, single insertion and deletion operations are cost-optimal and require $O\left(\frac{\log n}{p} + \log p\right)$ time using $p = O\left(\frac{\log n}{\log\log n}\right)$ processors. The second implementation is more scalable, since it uses a larger number of processors, and attains a "nearly" optimal speed-up on the single-port hypercube. Namely, the insertion of $\log n$ presorted items or the deletion of the $\log n$ smallest items is accomplished in $O((\log\log n)^2)$ time using $O\left(\frac{\log^2 n}{\log\log n}\right)$ processors. Finally, on the slightly more powerful pipelined hypercube model, the second implementation performs $\log n$ operations in $O(\log\log n)$ time using $O\left(\frac{\log^2 n}{\log\log n}\right)$ processors, thus achieving an optimal speed-up. To the best of our knowledge, our algorithms are the first implementations of b-bandwidth distributed priority queues that are load balanced and yet guarantee optimal speed-ups.
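To make the b-bandwidth data structure concrete, the following is a minimal sequential sketch in Python. It is not the hypercube algorithm of the paper: it only illustrates a heap whose nodes each hold b sorted items, with batch insertion of b presorted items and batch deletion of the b smallest items. The class name `BandwidthHeap`, its methods, and the array layout (children of node i at 2i+1 and 2i+2) are assumptions made for illustration.

```python
from heapq import merge


class BandwidthHeap:
    """Sequential sketch of a b-bandwidth heap (hypothetical helper class).

    Every node is a sorted list of exactly b items, and every item in a
    node is <= every item in its children.
    """

    def __init__(self, b):
        self.b = b
        self.nodes = []  # nodes[i] has children nodes[2i+1] and nodes[2i+2]

    def insert(self, items):
        """Insert b presorted items by pushing a batch down the root-to-leaf path."""
        b = self.b
        carry = list(items)
        assert len(carry) == b and carry == sorted(carry)
        # Ancestors of the slot the new node will occupy, collected leaf to root.
        path, k = [], len(self.nodes)
        while k > 0:
            k = (k - 1) // 2
            path.append(k)
        # Walk from the root down: each node keeps the b smallest items and
        # passes the b largest further down.
        for v in reversed(path):
            merged = list(merge(self.nodes[v], carry))
            self.nodes[v], carry = merged[:b], merged[b:]
        self.nodes.append(carry)

    def delete_min(self):
        """Remove and return the b smallest items (the root node)."""
        b = self.b
        smallest = self.nodes[0]
        last = self.nodes.pop()
        if not self.nodes:  # the root was the only node
            return smallest
        self.nodes[0] = last
        v = 0
        while True:
            left, right = 2 * v + 1, 2 * v + 2
            kids = [c for c in (left, right) if c < len(self.nodes)]
            if not kids or self.nodes[v][-1] <= min(self.nodes[c][0] for c in kids):
                break  # heap property already holds at v
            merged = list(merge(self.nodes[v], *(self.nodes[c] for c in kids)))
            self.nodes[v] = merged[:b]
            if len(kids) == 1:
                self.nodes[left] = merged[b:]
                v = left
            else:
                # The child with the larger maximum can safely hold the middle
                # block; the other child takes the largest block and is repaired next.
                lo, hi = sorted(kids, key=lambda c: self.nodes[c][-1])
                self.nodes[hi] = merged[b:2 * b]
                self.nodes[lo] = merged[2 * b:]
                v = lo
        return smallest
```

For example, with b = 2, inserting the batches [5, 9], [1, 7], [3, 4], [2, 8], [0, 6] and then calling `delete_min` five times returns the batches in globally sorted order. Each operation touches one root-to-leaf path of the tree; the parallel implementations described in the abstract speed up the work along such a path.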