An implementation of parallel file distribution in an agent hierarchy

  • Authors:
  • Munehiro Fukuda;Jumpei Miyauchi

  • Affiliations:
  • Computing & Software Systems, University of Washington, Bothell, USA 98011;Computer Science, Ehime University, Matsuyama, Japan 790-8577

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

PC grid is a cost-effective grid-computing platform that attracts users by allocating to their massively parallel applications as many desktop computers as requested. However, a challenge is how to distribute necessary files to remote computing nodes that may be unconnected to the same network file system, equipped with insufficient disk space to keep entire files, and even powered off asynchronously.Targeting PC grid, the AgentTeamwork grid-computing middleware deploys a hierarchy of mobile agents to remote desktops so as to launch, monitor, check-point, and resume a parallel and distributed computing job. To achieve high-speed file distribution, AgentTeamwork takes advantage of its agent hierarchy. The system partitions files into stripes at the tree root if they are random-access files, duplicates them at each tree level if they are shared among all remote nodes, fragments them into smaller messages if they are too large to relay to a lower tree level, aggregates such messages in a larger fragment if they are in transit to the same subtree, and returns output files to the user along multi-paths established within the tree. To achieve fault-tolerant file delivery, each agent periodically takes a snapshot of in-transit and on-memory file messages with its user job, and thus resumes them from the latest snapshot when they crash accidentally.This paper presents an implementation and its competitive performance of AgentTeamwork's file-distribution algorithm including file partitioning, transfer, check-pointing, and consistency maintenance.