TakTuk, adaptive deployment of remote executions

  • Authors:
  • Benoit Claudel;Guillaume Huard;Olivier Richard

  • Affiliations:
  • INRIA Sardes research team - CNRS LIG Laboratory - Grenoble University, France, Grenoble, France;INRIA Moais research team - CNRS LIG Laboratory - Grenoble University, France, Grenoble, France;INRIA Mescal research team - CNRS LIG Laboratory - Grenoble University, France, Grenoble, France

  • Venue:
  • Proceedings of the 18th ACM international symposium on High performance distributed computing
  • Year:
  • 2009

Abstract

This article presents TakTuk, a middleware that efficiently deploys parallel remote executions on large-scale grids (thousands of nodes). The tool is mostly intended for interactive use: administration of distributed machines and development of parallel applications. It therefore has to minimize the time required to complete the whole deployment process. To achieve this, we propose and validate a remote execution deployment model inspired by the real-world behavior of standard remote execution protocols (rsh and ssh). From this model and from existing work in networking, we deduce an optimal deployment algorithm for the homogeneous case. Unfortunately, this optimal algorithm does not translate directly to the heterogeneous case. We therefore derive from the theoretical solution a heuristic based on dynamic work-stealing that adapts to heterogeneity (processors, links, load, ...). The underlying principle of this heuristic is the same as that of the optimal algorithm: deploy nodes as soon as possible. Experiments assess TakTuk's efficiency and show that it scales well to thousands of nodes. Compared to similar tools, TakTuk ranks among the best performers while offering more features and versatility. In particular, TakTuk is the only tool really suited to deploying remote executions on grids or more heterogeneous platforms.
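
The "deploy nodes as soon as possible" principle can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a simplified homogeneous cost model in which an initiating node is busy for `send` time units per connection and the contacted node becomes ready `latency` time units after the connection starts. The function names and both parameters are hypothetical and chosen only for illustration.

```python
import heapq


def asap_deploy_time(n_nodes, send=1.0, latency=3.0):
    """Time until n_nodes remote nodes are ready when every ready node
    keeps starting new connections as soon as it is free.

    Assumed model (illustrative only): an initiator is busy for `send`
    time units per connection, and the contacted node is ready `latency`
    time units after the connection starts.
    """
    if n_nodes <= 0:
        return 0.0
    # Min-heap of times at which some ready node can start a connection.
    free_at = [0.0]  # the root node is ready at time 0
    finish = 0.0
    for _ in range(n_nodes):
        start = heapq.heappop(free_at)          # earliest available initiator
        ready = start + latency                 # when the new node is usable
        finish = max(finish, ready)
        heapq.heappush(free_at, start + send)   # initiator is free again
        heapq.heappush(free_at, ready)          # new node joins the deployment
    return finish


def flat_deploy_time(n_nodes, send=1.0, latency=3.0):
    """Baseline under the same assumed model: only the root initiates."""
    if n_nodes <= 0:
        return 0.0
    return (n_nodes - 1) * send + latency


if __name__ == "__main__":
    for n in (7, 100, 1000):
        print(n, asap_deploy_time(n), flat_deploy_time(n))
```

In this toy model the flat strategy grows linearly with the number of nodes, while the as-soon-as-possible strategy grows much more slowly because every newly ready node immediately takes part in the deployment. The heterogeneous case described in the abstract, where a work-stealing heuristic replaces the fixed schedule, is not covered by this sketch.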