Predicting the Performance of Wide Area Data Transfers

Authors:
Sudharshan Vazhkudai; Jennifer M. Schopf; Ian T. Foster
Venue:
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Year:
2002

Abstract

As Data Grids become more commonplace, large data sets are being replicated and distributed to multiple sites, leading to the problem of determining which replica can be accessed most efficiently. The answer to this question can depend on many factors, including physical characteristics of the resources and the load behavior on the CPUs, networks, and storage devices that are part of the end-to-end path linking possible sources and sinks.

We develop a predictive framework that combines (1) integrated instrumentation that collects information about the end-to-end performance of past transfers, (2) predictors to estimate future transfer times, and (3) a data delivery infrastructure that provides users with access to both the raw data and our predictions. We evaluate the performance of our predictors by applying them to log data collected from a wide area testbed. These preliminary results provide insights into the effectiveness of using predictors in this situation.
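
The abstract does not say which predictors the framework uses. As an illustration only, the sketch below assumes two simple choices, a sliding-window mean and a sliding-window median over the bandwidths of past transfers, and evaluates them by replaying a transfer log in order. The log schema (`TransferRecord`), the function names, and the window sizes are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Callable, List, Sequence


@dataclass
class TransferRecord:
    """One logged end-to-end transfer (hypothetical schema)."""
    size_bytes: int
    duration_s: float

    @property
    def bandwidth(self) -> float:
        """Achieved end-to-end bandwidth in bytes per second."""
        return self.size_bytes / self.duration_s


# A predictor maps a history of observed bandwidths to one estimate.
Predictor = Callable[[Sequence[float]], float]


def window_predictor(aggregate: Callable[[Sequence[float]], float],
                     window: int) -> Predictor:
    """Build a predictor that aggregates the last `window` observations."""
    def predict(history: Sequence[float]) -> float:
        return aggregate(history[-window:])
    return predict


def predict_transfer_time(history: List[TransferRecord],
                          size_bytes: int,
                          predictor: Predictor) -> float:
    """Estimate the duration of a new transfer from past transfers."""
    bandwidth = predictor([r.bandwidth for r in history])
    return size_bytes / bandwidth


def mean_abs_pct_error(log: List[TransferRecord],
                       predictor: Predictor,
                       warmup: int = 1) -> float:
    """Replay the log: predict each transfer from the ones before it
    and return the mean absolute percentage error of the estimates."""
    if not 1 <= warmup < len(log):
        raise ValueError("warmup must be at least 1 and less than len(log)")
    errors = []
    for i in range(warmup, len(log)):
        estimate = predict_transfer_time(log[:i], log[i].size_bytes, predictor)
        errors.append(abs(estimate - log[i].duration_s) / log[i].duration_s)
    return 100.0 * mean(errors)


# Two assumed predictors; the window sizes are arbitrary choices.
mean_of_last_5 = window_predictor(mean, 5)
median_of_last_5 = window_predictor(median, 5)
```

Given a log for each replica site, a client could call `predict_transfer_time` once per site and pick the replica with the smallest estimate, while `mean_abs_pct_error` gives a rough way to compare predictors on the same log.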