A data throughput prediction and optimization service for widely distributed many-task computing

Authors:
Dengpan Yin;Esma Yildirim;Tevfik Kosar
Affiliations:
Louisiana State University, Baton Rouge, LA;Louisiana State University, Baton Rouge, LA;Louisiana State University, Baton Rouge, LA
Venue:
Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers
Year:
2009

Abstract

In this paper, we present the design and implementation of a network throughput prediction and optimization service for many-task computing in widely distributed environments. The service uses multiple parallel TCP streams to improve the end-to-end throughput of data transfers. A novel mathematical model determines the number of parallel streams that achieves the best performance, and it can predict this optimal number from as few as three prediction points. We implement the service in the Stork data scheduler, where the prediction points can be obtained from Iperf and GridFTP samplings. Our results show that, in most cases, the prediction cost plus the optimized transfer time is much less than the unoptimized transfer time.
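
The abstract does not state the form of the model, so the following is only a minimal sketch of how three prediction points could be enough to choose a stream count. It assumes, purely for illustration, a throughput curve of the form Th(n) = n / sqrt(a*n^2 + b*n + c), where n is the number of parallel streams. That form has three unknowns, so three sampled (n, throughput) pairs determine it. The function names, the sample values, and the `max_streams` cap are hypothetical and are not taken from the paper or from Stork.

```python
from math import sqrt


def fit_model(samples):
    """Fit (a, b, c) of the ASSUMED model Th(n) = n / sqrt(a*n^2 + b*n + c).

    samples: exactly three (stream_count, measured_throughput) pairs.
    """
    if len(samples) != 3:
        raise ValueError("exactly three prediction points are expected")
    # Squaring the model gives n^2 / Th^2 = a*n^2 + b*n + c, linear in a, b, c.
    m = [[float(n * n), float(n), 1.0, (n / th) ** 2] for n, th in samples]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("sample stream counts must be distinct")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def predict_throughput(coeffs, n):
    """Evaluate the assumed model at stream count n."""
    a, b, c = coeffs
    denom = a * n * n + b * n + c
    if denom <= 0:
        raise ValueError("model is undefined for this stream count")
    return n / sqrt(denom)


def optimal_streams(coeffs, max_streams):
    """Return the integer stream count in [1, max_streams] with the highest
    predicted throughput (max_streams is a hypothetical safety cap)."""
    best_n, best_th = 1, float("-inf")
    for n in range(1, max_streams + 1):
        try:
            th = predict_throughput(coeffs, n)
        except ValueError:
            continue  # skip stream counts where the fitted model is undefined
        if th > best_th:
            best_n, best_th = n, th
    return best_n


if __name__ == "__main__":
    # Hypothetical sampling results: (streams, throughput in Mbps).
    points = [(1, 100.0), (2, 180.0), (4, 260.0)]
    coeffs = fit_model(points)
    print("suggested stream count:", optimal_streams(coeffs, max_streams=32))
```

In a real deployment the three points would come from short sampling transfers (the abstract mentions Iperf and GridFTP), and the cost of those samplings is what the abstract calls the prediction cost. Scanning integer stream counts rather than solving for the peak analytically keeps the sketch robust when the fitted curve has no interior maximum.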