Efficient communication using total-exchange

Authors:
Satish Rao;Torsten Suel;Thanasis Tsantilas;Mark Goudreau
Venue:
IPPS '95 Proceedings of the 9th International Symposium on Parallel Processing
Year:
1995

Abstract

A central question in parallel computing is the extent to which one can write parallel programs in a high-level, general-purpose, architecture-independent programming language and have them execute on a variety of parallel and distributed architectures without sacrificing efficiency. A large body of research suggests that, at least in theory, general-purpose parallel computing is indeed possible provided certain conditions are met: an excess of logical parallelism in the program, and the ability of the target architecture to efficiently realize balanced communication patterns. The canonical example of a balanced communication pattern is an h-relation, in which each processor is the origin and destination of at most h messages. A plethora of protocols has been designed for routing h-relations in a variety of networks. The goal has been to minimize the value of h while guaranteeing delivery of the messages within a constant factor of the optimal time. In this paper we describe protocols that meet the most stringent efficiency requirement, namely delivery of messages within a time that exceeds the best achievable by only a lower-order additive term. Such protocols are called 1-optimal. While these protocols achieve 1-optimality only for heavily loaded networks, that is, for large values of h, they are remarkable for their simplicity in that they use only the total-exchange communication primitive. The total-exchange can be realized in many networks using very simple, contention-free, and extremely efficient schemes. The technical contribution of this paper is a protocol to route random h-relations in an N-processor network using $\frac{h}{N}(1+o(1)) + O(\log\log N)$ total-exchange rounds with high probability. Using message duplication, we can improve the bound to $\frac{h}{N}(1+o(1)) + O(\log^* N)$. This improves upon the $\frac{h}{N}(1+o(1)) + O(\log N)$ bound of Gerbessiotis and Valiant.
While our theoretical improvements are modest, our experimental results show an improvement over the protocol of A. Gerbessiotis and L.G. Valiant.
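
To make the terms in the abstract concrete, the following Python sketch builds a random h-relation and counts the total-exchange rounds needed by a naive direct schedule, in which each processor delivers at most one message to every destination per round. This is not the protocol of the paper; it is only an illustrative baseline under assumed conventions. The function names (`random_h_relation`, `is_h_relation`, `naive_total_exchange_rounds`) and the choice to allow a processor to address messages to itself are assumptions made for this example.

```python
import random
from collections import Counter


def random_h_relation(n, h, seed=0):
    """Return msgs, where msgs[i] lists the destinations of processor i.

    Assumption for this sketch: every processor sends exactly h messages
    and receives exactly h messages (a balanced random assignment).
    """
    rng = random.Random(seed)
    dests = [d for d in range(n) for _ in range(h)]  # each destination h times
    rng.shuffle(dests)
    return [dests[i * h:(i + 1) * h] for i in range(n)]


def is_h_relation(msgs, h):
    """Check that each processor sends and receives at most h messages."""
    n = len(msgs)
    received = Counter(d for out in msgs for d in out)
    sends_ok = all(len(out) <= h for out in msgs)
    recvs_ok = all(received[d] <= h for d in range(n))
    return sends_ok and recvs_ok


def naive_total_exchange_rounds(msgs):
    """Rounds used by a direct schedule (hypothetical baseline).

    In one total-exchange round, processor i can hand one message to each
    destination j. Sending everything directly therefore takes as many
    rounds as the largest number of messages any single (i, j) pair has.
    """
    worst = 0
    for out in msgs:
        if out:
            worst = max(worst, max(Counter(out).values()))
    return worst


if __name__ == "__main__":
    n, h = 16, 256
    msgs = random_h_relation(n, h)
    print("valid h-relation:", is_h_relation(msgs, h))
    print("h/N:", h / n)
    print("naive rounds:", naive_total_exchange_rounds(msgs))
```

By the pigeonhole principle the naive schedule can never use fewer than ceil(h/N) rounds, and for a random h-relation it typically uses somewhat more because some sender-destination pairs are overloaded. The bounds quoted in the abstract concern how small that excess over h/N can be made.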