In irregular all-to-all communication, messages are exchanged between every pair of processors. The message sizes vary from processor to processor and are known only at run time. This is a fundamental communication primitive in parallelizing irregularly structured scientific computations. We present an algorithm for this primitive that reduces the total number of message start-ups. It also reduces node contention by smoothing out the lengths of the messages communicated. Compared with earlier approaches, our algorithm provides deterministic performance and reduces the buffer space required at the nodes during message passing. The performance of the algorithm is characterised using a simple communication model of high-performance computing (HPC) platforms. We describe implementations on the Cray T3D and IBM SP2 using C and the Message Passing Interface (MPI) standard; these can be easily ported to other HPC platforms. The results show the effectiveness of the proposed technique as well as the interplay among the machine size, the variance in message length, and the network interface.
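To illustrate why variance in message length matters, the following is a minimal sketch in C of a cost estimate for a naive direct exchange. It is not the algorithm proposed in the abstract. It assumes a linear cost model (a fixed start-up cost plus a per-byte cost), which is a common simplification and may differ from the model used in the paper. The names `cost_model`, `ts`, `tw`, and `direct_exchange_time` are hypothetical and chosen for this example only.

```c
#include <stddef.h>

/* Hypothetical linear cost model (an assumption, not taken from the paper):
 * sending a message of m bytes costs ts + m * tw. */
typedef struct {
    double ts; /* start-up cost per message */
    double tw; /* transfer cost per byte */
} cost_model;

/*
 * Estimate the time of a naive direct exchange among p nodes in p - 1 steps.
 * len is a p x p row-major matrix: len[i * p + j] is the number of bytes
 * node i sends to node j. In step k, node i sends to node (i + k) % p.
 * Each step is assumed to last as long as its longest message, so a single
 * long message delays every node in that step.
 */
double direct_exchange_time(const size_t *len, size_t p, cost_model m)
{
    double total = 0.0;
    for (size_t k = 1; k < p; ++k) {
        size_t longest = 0;
        for (size_t i = 0; i < p; ++i) {
            size_t bytes = len[i * p + (i + k) % p];
            if (bytes > longest) {
                longest = bytes;
            }
        }
        total += m.ts + m.tw * (double)longest;
    }
    return total;
}
```

Under these assumptions, with three nodes, `ts = 10`, and `tw = 1`, a uniform exchange of 100 bytes per pair (600 bytes in total) is estimated at 220 time units, whereas concentrating the same 600 bytes in one message from node 0 to node 1 is estimated at 620. This gap is the kind of imbalance that smoothing out message lengths is meant to reduce.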