This paper addresses the problem of efficiently scheduling, as a collective, the file transfers requested by a batch of tasks. Our work targets a heterogeneous collection of storage and compute clusters. The goal is to minimize the overall time needed to transfer the files to their respective destination nodes. Two scheduling schemes are proposed and experimentally evaluated against an existing approach, Insertion Scheduling. The first is a 0-1 integer programming formulation built on the idea of time-expanded networks. This scheme achieves the minimum total file transfer time but incurs significant scheduling overhead. To address this issue, we propose a heuristic based on maximum weight graph matching. This scheme performs as well as Insertion Scheduling and has much lower scheduling overhead than the integer programming scheme. We conclude that the heuristic scheme is a better fit for larger workloads and systems.
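The abstract does not spell out the matching heuristic, so the following is only a minimal sketch of the general idea of matching-based transfer scheduling, not the authors' algorithm. It assumes that each node can take part in at most one transfer at a time, that every transfer carries a positive weight (a hypothetical priority, for example the file size), and that the schedule is built in rounds, each round being a maximum weight matching over the transfers still pending. The names `Transfer`, `max_weight_matching`, and `schedule_rounds` are illustrative, and the exhaustive search is suitable only for toy instances.

```python
from typing import FrozenSet, List, Tuple

# (source node, destination node, weight); the weight is a hypothetical
# priority such as file size. None of this comes from the paper itself.
Transfer = Tuple[str, str, float]


def max_weight_matching(edges: List[Transfer]) -> List[Transfer]:
    """Exact maximum weight matching by exhaustive search (toy sizes only)."""
    best: List[Transfer] = []
    best_weight = 0.0

    def search(i: int, used: FrozenSet[str],
               chosen: List[Transfer], weight: float) -> None:
        nonlocal best, best_weight
        if i == len(edges):
            if weight > best_weight:
                best, best_weight = list(chosen), weight
            return
        src, dst, w = edges[i]
        # Option 1: leave this transfer out of the current round.
        search(i + 1, used, chosen, weight)
        # Option 2: include it if both endpoints are still free.
        if src not in used and dst not in used:
            chosen.append(edges[i])
            search(i + 1, used | {src, dst}, chosen, weight + w)
            chosen.pop()

    search(0, frozenset(), [], 0.0)
    return best


def schedule_rounds(transfers: List[Transfer]) -> List[List[Transfer]]:
    """Split transfers into rounds; each round is a maximum weight matching."""
    if any(w <= 0 for _, _, w in transfers):
        raise ValueError("weights must be positive so every round makes progress")
    pending = list(transfers)
    rounds: List[List[Transfer]] = []
    while pending:
        batch = max_weight_matching(pending)
        rounds.append(batch)
        for t in batch:
            pending.remove(t)
    return rounds
```

Calling `schedule_rounds` on a list of `(source, destination, weight)` tuples returns a list of rounds in which no node appears twice within a round. A practical implementation would replace the exhaustive search with a polynomial-time matching algorithm and would account for heterogeneous link bandwidths, which this sketch ignores.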