In prior work, we analyzed GridFTP usage logs collected by data transfer nodes (DTNs) at national scientific computing centers and found significant throughput variance, even among transfers between the same two end hosts. The goal of this work is to quantify the impact of various factors on that variance. Our methodology consisted of executing experiments on a high-speed research testbed, running large instrumented transfers between operational DTNs, and building statistical models from the collected measurements. We created a non-linear regression model for memory-to-memory transfer throughput as a function of CPU usage at the two DTNs and of packet loss rate. The model is useful for determining the concomitant resource allocations to use in scheduling requests. For example, if a whole CPU core at the NERSC DTN can be assigned to the GridFTP process executing a large memory-to-memory transfer to SLAC, then only 32% of a CPU core is required for the corresponding GridFTP process at the SLAC DTN, because the two DTNs differ in computing speed. With these CPU allocations, data can be moved at 6.3 Gbps, which sets the rate to request from the circuit scheduler.
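The NERSC-to-SLAC example can be turned into a small allocation helper. The sketch below is not the non-linear regression model described above: it assumes, purely for illustration, that both the required SLAC share and the achievable rate scale linearly with the NERSC CPU share, and it ignores packet loss. The function name `concomitant_allocation` and the constant names are hypothetical; only the 32% and 6.3 Gbps figures come from the text.

```python
# Figures quoted in the abstract for a NERSC -> SLAC memory-to-memory transfer.
NERSC_FULL_CORE_GBPS = 6.3             # rate when one whole NERSC DTN core is assigned
SLAC_SHARE_PER_NERSC_CORE = 0.32       # SLAC core fraction matching one NERSC core


def concomitant_allocation(nersc_share):
    """Return (slac_share, rate_gbps) for a given NERSC CPU-core share.

    ASSUMPTION: linear scaling in nersc_share, with no packet loss.
    The model in the text is non-linear, so this is only a rough sketch.
    """
    if not 0.0 < nersc_share <= 1.0:
        raise ValueError("nersc_share must be in (0, 1]")
    slac_share = nersc_share * SLAC_SHARE_PER_NERSC_CORE
    rate_gbps = nersc_share * NERSC_FULL_CORE_GBPS
    return slac_share, rate_gbps


if __name__ == "__main__":
    slac_share, rate_gbps = concomitant_allocation(1.0)
    print(f"SLAC core share: {slac_share:.2f}, circuit rate: {rate_gbps:.1f} Gbps")
```

With a whole NERSC core, the helper reproduces the figures in the text: a 0.32 core share at SLAC and a 6.3 Gbps circuit request. Values for partial shares depend entirely on the linearity assumption and should not be read as results of the study.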