Data grids provide an environment in which communities of researchers can share, replicate, and manage access to copies of large datasets. In such environments, choosing which of several replica locations to fetch data from requires accurate predictions of end-to-end transfer times. Predicting transfer time is difficult because the end-to-end data path involves several shared components, including networks and disks, each of which experiences load variations that can significantly affect throughput. Of these, disk accesses are rapidly growing in cost and have not previously been considered, although on some machines they can account for up to 30% of the transfer time. In this paper, we present techniques that combine observations of end-to-end application behavior with disk I/O throughput load data. We develop a set of regression models that derive predictions characterizing the effect of disk load variations on file transfer times. We also include network component variations and apply these techniques to logs of transfers made with the GridFTP server, part of the Globus Toolkit™. We observe up to a 9% improvement in prediction accuracy compared with approaches based on past system behavior in isolation.
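The regression idea described above can be illustrated with a minimal sketch. It is not the paper's actual model set: it assumes a single linear relationship between observed disk I/O load and logged transfer throughput, and the function names (`fit_linear`, `predict_transfer_time`), the units, and the sample data are all hypothetical.

```python
import numpy as np

def fit_linear(disk_load, throughput):
    """Least-squares fit of throughput = a + b * disk_load."""
    load = np.asarray(disk_load, dtype=float)
    # Design matrix with an intercept column and the disk-load column.
    X = np.column_stack([np.ones(len(load)), load])
    coef, *_ = np.linalg.lstsq(X, np.asarray(throughput, dtype=float), rcond=None)
    return coef  # (intercept a, slope b)

def predict_transfer_time(coef, current_disk_load, file_size_mb):
    """Predict transfer time in seconds from the current disk load."""
    a, b = coef
    predicted_throughput = a + b * current_disk_load  # MB/s
    if predicted_throughput <= 0:
        raise ValueError("model predicts non-positive throughput")
    return file_size_mb / predicted_throughput

# Hypothetical log: disk load (MB/s of competing I/O) paired with the
# end-to-end throughput (MB/s) of transfers observed at that load.
disk_load = np.array([5.0, 10.0, 20.0, 30.0, 40.0])
throughput = np.array([9.5, 9.0, 8.0, 7.0, 6.0])

coef = fit_linear(disk_load, throughput)
eta = predict_transfer_time(coef, current_disk_load=25.0, file_size_mb=750.0)
print(f"predicted transfer time: {eta:.1f} s")
```

A replica selector could run such a fit per source site and pick the site with the smallest predicted time. A predictor based on past behavior in isolation would instead use only the history of `throughput`, ignoring `disk_load`.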