A Feedback Mechanism for Network Scheduling in LambdaGrids

Authors:
Pallab Datta; Sushant Sharma; Wu-Chun Feng
Affiliations:
Los Alamos National Laboratory, USA; Los Alamos National Laboratory, USA; Virginia Tech, USA
Venue:
CCGRID '06 Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid
Year:
2006

Abstract

Next-generation e-Science applications will require the ability to transfer information at high data rates between distributed computing centers and data repositories. A LambdaGrid offers dedicated, optical, circuit-switched, point-to-point connections, which may be reserved exclusively for an application. Though such dedicated high-speed connections eliminate congestion in the network, they effectively push the network congestion out to the end systems, as processing speeds have not kept up with networking speeds. Therefore, developing an efficient transport protocol over such high-speed dedicated circuits is of critical importance. In this work, we propose the idea of a lightweight end-system protocol, based on performance monitoring, to significantly improve the performance of data transport over a LambdaGrid. In particular, we focus on dynamically monitoring the OS task scheduling at the receiving end-system so that potential end-system congestion may be detected early and appropriate feedback can be transmitted back to the sending end-system to avoid packet losses. One example of such an evasive action is to suspend transmission for a certain duration of time during which the OS on the receiving end-system must handle other computational processes. With this in mind, we propose to extend the Reliable-Blast UDP (RBUDP) protocol to take such evasive action by using a simple feedback mechanism that is activated via performance monitoring. The new protocol, named RBUDP+, dramatically improves the performance of data transfer over LambdaGrids. We demonstrate the effectiveness of our proposed protocol and illustrate the performance gains achieved via network emulation.
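
The feedback idea described in the abstract can be illustrated with a small, deterministic toy simulation. This is only a sketch under stated assumptions, not the RBUDP+ implementation: the names `Receiver` and `transfer`, the tick-based timing, the buffer size, and the busy schedule are all hypothetical, and the feedback is assumed to reach the sender instantly and accurately. The model shows only how pausing transmission while the receiver is descheduled avoids buffer overflow; it does not model the cost of retransmission rounds.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    """Toy receiving end-system with a bounded buffer (all values are assumed)."""
    buffer_slots: int        # packets the receive buffer can hold
    busy_ticks: frozenset    # ticks during which the OS runs other processes
    queued: int = 0
    delivered: int = 0
    dropped: int = 0

    def is_busy(self, tick):
        """True when the receiving process is not scheduled at this tick."""
        return tick in self.busy_ticks

    def arrive(self, n):
        """Buffer up to the free space; count the rest as lost. Returns accepted."""
        accepted = min(n, self.buffer_slots - self.queued)
        self.queued += accepted
        self.dropped += n - accepted
        return accepted

    def drain(self, tick, rate):
        """Hand buffered packets to the application, but only when scheduled."""
        if not self.is_busy(tick):
            taken = min(self.queued, rate)
            self.queued -= taken
            self.delivered += taken


def transfer(total_packets, rate, receiver, use_feedback, max_ticks=10_000):
    """Send total_packets at `rate` packets per tick.

    With use_feedback=True the sender suspends transmission whenever the
    receiver reports that it is busy (idealized, zero-delay feedback).
    Lost packets stay in `remaining` and are sent again later.
    Returns (ticks_elapsed, packets_dropped).
    """
    remaining = total_packets
    tick = 0
    while receiver.delivered < total_packets and tick < max_ticks:
        paused = use_feedback and receiver.is_busy(tick)
        if remaining > 0 and not paused:
            burst = min(rate, remaining)
            remaining -= receiver.arrive(burst)
        receiver.drain(tick, rate)
        tick += 1
    return tick, receiver.dropped


if __name__ == "__main__":
    busy = frozenset({3, 4, 5, 6, 13, 14, 15, 16})  # assumed busy schedule
    for feedback in (False, True):
        rx = Receiver(buffer_slots=4, busy_ticks=busy)
        ticks, lost = transfer(40, 2, rx, use_feedback=feedback)
        print(f"feedback={feedback}: ticks={ticks}, dropped={lost}")
```

In this sketch the sender without feedback keeps blasting into a full buffer during the busy periods and loses packets, while the sender with feedback idles during those periods and loses none. A real deployment would have to predict busy periods from OS scheduling data and account for feedback latency, neither of which is modeled here.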