This paper presents a new approach to implementing global reduction operations in wormhole k-ary n-cubes. The novelty lies in using a multidestination message-passing mechanism instead of single-destination (unicast) messages. Using pairwise exchange worms along each dimension, it is shown that complete global reduction and barrier synchronization operations, as defined by the Message Passing Interface (MPI) standard, can be implemented with n communication start-ups, compared with the 2n⌈log₂ k⌉ start-ups required with unicast-based message passing. Analytical results for different values of communication start-up time, system size, and data size are presented and compared with the unicast-based scheme. The analysis indicates that the proposed framework can be used effectively in wormhole-routed systems to achieve fast global reduction without a separate control network.
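
To make the start-up comparison concrete, the two counts quoted in the abstract can be tabulated for a few network sizes. The following is a minimal sketch, not the paper's analytical model: the function names are hypothetical, and it counts communication start-ups only, ignoring propagation delay, data size, and contention.

```python
def ceil_log2(k: int) -> int:
    """Integer ceil(log2(k)) for k >= 1, avoiding floating-point rounding."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return (k - 1).bit_length()


def unicast_startups(k: int, n: int) -> int:
    """Start-ups for the unicast-based scheme, 2n * ceil(log2 k), as quoted in the abstract."""
    return 2 * n * ceil_log2(k)


def multidestination_startups(k: int, n: int) -> int:
    """Start-ups with multidestination exchange worms: n, one per dimension (k is unused)."""
    return n


if __name__ == "__main__":
    # (k, n) pairs chosen purely for illustration; they are not taken from the paper.
    for k, n in [(4, 2), (16, 2), (8, 3), (32, 3)]:
        u = unicast_startups(k, n)
        m = multidestination_startups(k, n)
        print(f"k={k:2d} n={n}: unicast={u:2d} multidestination={m}")
```

For example, in a 16-ary 2-cube (k = 16, n = 2) the formulas give 16 start-ups for the unicast-based scheme and 2 for the multidestination scheme.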