An interconnect for a high-performance cluster has to be optimized with respect to both high throughput and low latency. To avoid the tradeoff between the two, the cluster interconnect Clint has a segregated architecture that provides two physically separate transmission channels: a bulk channel optimized for high-bandwidth traffic and a quick channel optimized for low-latency traffic. The two channels use different scheduling strategies. The bulk channel uses a scheduler that globally allocates time slots on the transmission paths before packets are sent off, so that both collisions and blockages are avoided. In contrast, the quick channel takes a best-effort approach, sending packets as soon as they are available and accepting the risk of collisions and retransmissions.

Simulation results clearly show the performance advantages of the segregated architecture: the carefully scheduled bulk channel can be loaded to nearly its full capacity without exhibiting the head-of-line blocking that limits many networks, while the quick channel provides low-latency communication even in the presence of high-bandwidth traffic.
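The segregated dispatch and the bulk channel's conflict-free slot allocation can be sketched as a toy model. The channel names, the 1 KB size threshold, and the greedy allocator below are illustrative assumptions, not the paper's actual implementation:

```python
# Toy sketch of a segregated two-channel interconnect (assumptions, not
# the Clint implementation): small messages go to the low-latency quick
# channel; large transfers go to the globally scheduled bulk channel.

QUICK_MAX_BYTES = 1024  # assumed cutoff separating quick from bulk traffic


def pick_channel(msg_len: int) -> str:
    """Route latency-sensitive small messages to the quick channel and
    high-bandwidth transfers to the scheduled bulk channel."""
    return "quick" if msg_len <= QUICK_MAX_BYTES else "bulk"


class BulkScheduler:
    """Toy global time-slot allocator for the bulk channel: each slot
    admits at most one packet per source and one per destination, so the
    transfers admitted to a slot can neither collide nor block each other."""

    def __init__(self) -> None:
        self.slots: list[set[tuple[int, int]]] = []  # slots[t] = {(src, dst)}

    def allocate(self, src: int, dst: int) -> int:
        """Greedily place a (src, dst) transfer in the earliest slot where
        neither its source nor its destination is already busy; open a new
        slot if none fits. Returns the assigned slot index."""
        for t, slot in enumerate(self.slots):
            if all(s != src and d != dst for s, d in slot):
                slot.add((src, dst))
                return t
        self.slots.append({(src, dst)})
        return len(self.slots) - 1


if __name__ == "__main__":
    sched = BulkScheduler()
    print(pick_channel(64))        # quick
    print(sched.allocate(0, 1))    # slot 0
    print(sched.allocate(2, 3))    # slot 0: disjoint ports, same slot
    print(sched.allocate(0, 3))    # slot 1: port 0 (src) and 3 (dst) are busy
```

In this sketch the allocator plays the role of the global scheduler described above: because no slot ever contains two transfers sharing a port, an admitted bulk packet never waits behind another, which is how the scheduled channel avoids head-of-line blocking.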