Contention-Aware Communication Schedule for High-Speed Communication

  • Authors:
  • Anthony T. C. Tam;Cho-Li Wang

  • Affiliations:
  • Department of Computer Science and Information Systems, University of Hong Kong, Hong Kong;Department of Computer Science and Information Systems, University of Hong Kong, Hong Kong

  • Venue:
  • Cluster Computing
  • Year:
  • 2003

Abstract

Over the past decade, considerable effort has been devoted to reducing software overhead, which is known to be the major hindrance to high-speed communication. However, this paper shows that a low-latency communication system does not by itself guarantee high performance, because low latency leaves other communication issues unaddressed, such as contention and the scheduling of communication events. Using the complete exchange operation as a case study, we show that carefully designed communication schedules can achieve efficient communication while also preventing congestion. We have developed a complete exchange algorithm, the Synchronous Shuffle Exchange, which is optimal on a non-blocking network. To avoid congestion loss caused by non-deterministic delays in communication events, we introduce a global congestion control scheme. This scheme coordinates all participating nodes to monitor and regulate the traffic load, which effectively avoids congestion loss while maintaining sufficient throughput to maximize performance. To improve the effectiveness of the congestion control scheme on a hierarchical network, we incorporate information about the network topology to devise a contention-aware permutation. This permutation scheme generates a communication schedule that is free of both node and switch contention and that distributes the network load more evenly across the hierarchy. This relieves congestion build-up at the uplink ports and improves the synchronism of the traffic information exchange between cluster nodes. We examine and report performance results of our implementation on a 32-node cluster with various network configurations.
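
As a rough illustration of what a node contention-free schedule for complete exchange can look like, the following sketch builds a generic cyclic-shift schedule: in round r, node i sends to node (i + r) mod P. This is a textbook construction used here only as an assumption for illustration. It is not the Synchronous Shuffle Exchange or the contention-aware permutation described in the abstract, whose details are not given here, and the function names are hypothetical.

```python
def shift_schedule(num_nodes):
    """Build a cyclic-shift complete-exchange schedule (illustrative only).

    Returns a list of num_nodes - 1 rounds. In each round, entry i is the
    destination that node i sends to during that round.
    """
    return [
        [(src + r) % num_nodes for src in range(num_nodes)]
        for r in range(1, num_nodes)
    ]


def is_node_contention_free(round_dests):
    """A round is node contention-free if no two senders share a destination."""
    return len(set(round_dests)) == len(round_dests)


def covers_all_pairs(schedule, num_nodes):
    """Check that every ordered pair (src, dst), src != dst, occurs exactly once."""
    seen = [(src, dst) for rnd in schedule for src, dst in enumerate(rnd)]
    expected = {(s, d) for s in range(num_nodes) for d in range(num_nodes) if s != d}
    return len(seen) == len(expected) and set(seen) == expected


# Example with 8 nodes (a hypothetical size chosen for readability).
schedule = shift_schedule(8)
for round_number, dests in enumerate(schedule, start=1):
    print(round_number, dests)
```

Each round of this schedule is a permutation of the nodes, so every node sends exactly one message and receives exactly one message per round. The sketch deliberately ignores switch contention and the network hierarchy, which are the aspects the abstract says the contention-aware permutation addresses.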