This paper presents a new approach to implementing fast barrier synchronization in wormhole-routed k-ary n-cubes. The novelty lies in using multidestination messages instead of the traditional single-destination messages. Two multidestination worm types, gather and broadcasting, are introduced to implement the report and wake-up phases of barrier synchronization, respectively. Algorithms for complete and arbitrary-set barrier synchronization are presented using these new worms. It is shown that complete barrier synchronization in a k-ary n-cube system with e-cube routing can be implemented with 2n communication start-ups, compared to the 2n log₂ k start-ups needed with unicast-based message passing. For arbitrary-set barriers, an interesting trend is observed: beyond a certain number of participating nodes, the synchronization cost continues to decrease as more nodes participate.
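To make the start-up comparison concrete, the two counts quoted in the abstract can be evaluated for a few network sizes. The following is a minimal sketch that only plugs values into the stated formulas (2n and 2n log₂ k); it does not model the worms or the routing. The function names are hypothetical, and the sample sizes assume k is a power of two so that log₂ k is an integer; the abstract does not say how a non-integer value would be rounded, so the sketch leaves it unrounded.

```python
import math


def multidestination_startups(n: int) -> int:
    """Start-ups for a complete barrier with multidestination worms: 2n."""
    return 2 * n


def unicast_startups(k: int, n: int) -> float:
    """Start-ups for a complete barrier with unicast messages: 2n * log2(k).

    Left unrounded; rounding for non-power-of-two k is not specified
    in the abstract.
    """
    return 2 * n * math.log2(k)


if __name__ == "__main__":
    # Sample sizes are illustrative assumptions, not taken from the paper.
    for k, n in [(4, 2), (8, 3), (16, 2)]:
        multi = multidestination_startups(n)
        uni = unicast_startups(k, n)
        print(f"k={k:2d} n={n}: multidestination={multi}, unicast={uni:g}")
```

The ratio between the two counts is log₂ k, so under these formulas the advantage of multidestination worms grows with the radix k and is independent of the dimension n.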