Separated high-bandwidth and low-latency communication in the cluster interconnect Clint

Authors:
Hans Eberle;Nils Gura
Affiliations:
Sun Microsystems Laboratories, Mountain View, CA;Sun Microsystems Laboratories, Mountain View, CA
Venue:
Separated high-bandwidth and low-latency communication in the cluster interconnect clint
Year:
2002

Abstract

An interconnect for a high-performance cluster has to be optimized with respect to both high throughput and low latency. To avoid the tradeoff between throughput and latency, the cluster interconnect Clint has a segregated architecture that provides two physically separate transmission channels: a bulk channel optimized for high-bandwidth traffic and a quick channel optimized for low-latency traffic. The two channels use different scheduling strategies. The bulk channel uses a scheduler that globally allocates time slots on the transmission paths before packets are sent off. In this way, collisions as well as blockages are avoided. In contrast, the quick channel takes a best-effort approach by sending packets whenever they are available, thereby risking collisions and retransmissions. Clint is targeted specifically at small- to medium-sized clusters, offering a low-cost alternative to symmetric multiprocessor (SMP) systems. This design point allows for a simple and cost-effective implementation. In particular, packets are buffered only on the hosts and the switches require no buffer memory; this simplifies protocols because switch forwarding delays are fixed, and it optimizes throughput because a global schedule becomes possible.
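
The contrast between the two channels can be illustrated with a minimal Python sketch. It is not the scheduler or protocol from the paper. The function names `schedule_bulk_slot` and `send_quick`, the greedy fewest-choices ordering, and the assumption that all packets involved in a collision are lost and must be retransmitted are hypothetical choices made only for illustration.

```python
from collections import defaultdict


def schedule_bulk_slot(requests):
    """Allocate one bulk-channel time slot before any packet is sent.

    `requests` maps a source host to the destinations it has packets
    queued for. The result maps each granted source to one destination
    such that no destination is used twice, so the slot is collision-free.
    The greedy ordering (fewest choices first) is an illustrative
    assumption, not the paper's algorithm.
    """
    granted = {}
    taken = set()
    for src in sorted(requests, key=lambda s: (len(requests[s]), s)):
        for dst in requests[src]:
            if dst not in taken:
                granted[src] = dst
                taken.add(dst)
                break
    return granted


def send_quick(packets):
    """Send quick-channel packets immediately, without scheduling.

    `packets` is a list of (source, destination) pairs sent in the same
    interval. Packets that target the same destination collide.
    Assumption: every packet in a collision is lost and has to be
    retransmitted by its source.
    """
    by_dst = defaultdict(list)
    for src, dst in packets:
        by_dst[dst].append(src)

    delivered, collided = [], []
    for dst, sources in by_dst.items():
        if len(sources) == 1:
            delivered.append((sources[0], dst))
        else:
            collided.extend((src, dst) for src in sources)
    return delivered, collided


# Bulk channel: every host is granted a distinct destination.
bulk_grants = schedule_bulk_slot({0: [1, 2], 1: [2], 2: [0, 1]})

# Quick channel: hosts 0 and 1 both target host 2 and collide.
quick_delivered, quick_collided = send_quick([(0, 2), (1, 2), (3, 0)])
```

The bulk path pays the cost of a scheduling step up front and never loses a packet to contention, while the quick path has no scheduling delay but has to recover from collisions afterwards.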