Pipelined broadcast splits a large broadcast message into segments and broadcasts the segments in a pipelined fashion, which can achieve high performance in many systems. In this paper, we investigate techniques for efficient pipelined broadcast on clusters connected by multiple Ethernet switches. Specifically, we develop algorithms for computing various contention-free broadcast trees that are suitable for pipelined broadcast on Ethernet switched clusters; extend the parameterized LogP model to predict appropriate segment sizes for pipelined broadcast; show that the segment sizes computed from the model yield high performance; and evaluate various pipelined broadcast schemes through experiments on Ethernet switched clusters with various topologies. The results demonstrate that our techniques are practical and efficient for contemporary Fast Ethernet and Gigabit Ethernet clusters.
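To illustrate why the segment size matters, the following is a minimal sketch that estimates pipelined broadcast time and picks a segment size from a list of candidates. It is not the extended parameterized LogP model developed in the paper: it assumes a much simpler linear cost of `alpha + bytes * beta` per point-to-point send, treats every segment as full-sized, and assumes each node forwards a segment to `fanout` children one after another. The function names, parameters, and numeric values are all hypothetical.

```python
import math


def pipelined_broadcast_time(msg_bytes, seg_bytes, tree_height, alpha, beta, fanout=1):
    """Estimate the completion time of a pipelined broadcast.

    Assumed (simplified) model, not the paper's:
      - one send of n bytes costs alpha + n * beta seconds,
      - every segment is treated as a full seg_bytes segment,
      - each node sends a segment to its `fanout` children in sequence,
      - the pipeline needs tree_height stages to fill, then delivers
        one further segment per stage.
    """
    num_segments = math.ceil(msg_bytes / seg_bytes)
    stage_time = fanout * (alpha + seg_bytes * beta)
    return (tree_height + num_segments - 1) * stage_time


def best_segment_size(msg_bytes, tree_height, alpha, beta, candidates, fanout=1):
    """Return the candidate segment size with the smallest estimated time."""
    return min(
        candidates,
        key=lambda s: pipelined_broadcast_time(
            msg_bytes, s, tree_height, alpha, beta, fanout
        ),
    )


if __name__ == "__main__":
    # Illustrative values only: 1 MiB message, 16 nodes in a linear tree,
    # 50 microseconds startup cost, 80 ns per byte (roughly 100 Mbit/s).
    msg = 1 << 20
    height = 15
    alpha, beta = 50e-6, 8e-8
    candidates = [1 << k for k in range(8, 21)]  # 256 B .. 1 MiB

    seg = best_segment_size(msg, height, alpha, beta, candidates)
    print("chosen segment size:", seg)
    print("pipelined estimate :", pipelined_broadcast_time(msg, seg, height, alpha, beta))
    print("unsegmented estimate:", pipelined_broadcast_time(msg, msg, height, alpha, beta))
```

Under this simplified model, very small segments pay the startup cost too many times, while very large segments leave the pipeline mostly empty, so the best choice lies in between. A more faithful predictor would replace the linear cost with measured, message-size-dependent parameters.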