Distributed Scheduling of Parallel I/O in the Presence of Data Replication

  • Authors:
  • Jan-Jan Wu;Pangfeng Liu

  • Affiliations:
  • Academia Sinica, Taipei, Taiwan;National Taiwan University, Taipei, Taiwan

  • Venue:
  • IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
  • Year:
  • 2005

Abstract

This paper studies distributed scheduling of parallel I/O data transfers on systems that provide data replication. In our previous work, we proposed a centralized algorithm for solving this problem in systems where data transfer information is centrally available. This algorithm finds the optimal schedule by constructing augmenting paths in the data transfer bipartite graph, and runs in $O(nm \log n + n^{2} \log^{3/2} n)$ time for a bipartite graph with $n$ nodes and $m$ edges. In this paper, we investigate this scheduling problem in distributed systems where data transfer information may not be centrally available. We propose a distributed scheduling algorithm, Highest Degree Lowest Workload First (HDLWF), which approximates the augmenting path algorithm in distributed environments. HDLWF is based on a distributed, two-step scheme that determines an appropriate execution order of data requests through a small number of rounds of bidding between clients and I/O servers. Our experimental results indicate that HDLWF yields schedules close to the centralized optimal solution, in some cases within 3% of it.
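
To make the idea of round-based bidding concrete, the following is a minimal single-process sketch, not the HDLWF algorithm as specified in the paper. The abstract does not give the bidding rules, so everything here is an assumption inferred from the algorithm's name: each client bids with one pending request per round to its least-loaded replica server, each server accepts at most one bid per round and prefers the client with the most pending requests, and the function name `bidding_schedule` and the request format are hypothetical.

```python
from collections import defaultdict


def bidding_schedule(requests):
    """Simulate round-based bidding between clients and I/O servers.

    requests: list of (client, [servers holding a replica]) pairs.
    Returns a list of rounds; each round is a list of
    (client, server, request_index) transfers that run concurrently.
    ASSUMPTION: bid and acceptance rules are illustrative, not from the paper.
    """
    for _, servers in requests:
        if not servers:
            raise ValueError("every request needs at least one replica server")

    pending = set(range(len(requests)))
    served = defaultdict(int)  # transfers each server has accepted so far
    rounds = []

    while pending:
        # Degree of a client = number of its requests still unscheduled.
        degree = defaultdict(int)
        for i in pending:
            degree[requests[i][0]] += 1

        # Step 1 (assumed): each client bids with one pending request to the
        # replica server with the lowest workload. Counting bids already
        # placed in this round is a simplification of the simulation; real
        # clients would not see each other's bids for free.
        bids = {}
        bidding_clients = set()
        for i in sorted(pending):
            client, servers = requests[i]
            if client in bidding_clients:
                continue  # one transfer per client per round
            bidding_clients.add(client)
            target = min(
                servers,
                key=lambda s: (served[s] + len(bids.get(s, [])), s),
            )
            bids.setdefault(target, []).append(i)

        # Step 2 (assumed): each server accepts the bid whose client has the
        # highest degree; ties go to the lower request index.
        this_round = []
        for server, idxs in sorted(bids.items()):
            winner = max(idxs, key=lambda i: (degree[requests[i][0]], -i))
            this_round.append((requests[winner][0], server, winner))
            served[server] += 1
            pending.discard(winner)
        rounds.append(this_round)

    return rounds


if __name__ == "__main__":
    demo = [("c1", ["s1", "s2"]), ("c2", ["s1", "s2"]), ("c1", ["s1"])]
    for t, transfers in enumerate(bidding_schedule(demo)):
        print(t, transfers)
```

Every round schedules at least one transfer, so the loop terminates after at most one round per request. The sketch makes no claim about how close its schedules come to the optimum; the 3% figure above refers to the authors' experiments with HDLWF itself.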