Cluster-wide context switch of virtualized jobs

Authors:
Fabien Hermenier;Adrien Lèbre;Jean-Marc Menaud
Affiliations:
ASCOLA Research Group, Mines de Nantes, INRIA, LINA UMR;ASCOLA Research Group, Mines de Nantes, INRIA, LINA UMR;ASCOLA Research Group, Mines de Nantes, INRIA, LINA UMR
Venue:
Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Year:
2010

Abstract

Clusters are mostly used through Resource Management Systems (RMS) that statically allocate resources for a bounded amount of time. Such approaches are known to be insufficient for an efficient use of clusters. A finer-grained RMS requires job preemption, migration, and dynamic allocation of resources. However, due to the complexity of developing and using such mechanisms, advanced scheduling strategies have rarely been deployed. This situation is changing thanks to the migration and preemption capabilities of Virtual Machines (VMs). However, although manipulating jobs composed of VMs makes it possible to change the state of the jobs according to the scheduling objective, changing the state and the location of numerous VMs at each decision is tedious and degrades overall performance. In addition to implementing the scheduling policy, developers have to ensure that the actions are feasible and execute them as efficiently as possible. In this paper, we argue that such an operation is independent of the policy itself and can be addressed through a generic mechanism, the cluster-wide context switch. With it, developers can implement sophisticated job scheduling algorithms without handling the issues related to manipulating the jobs. They focus only on the algorithm that selects the jobs to run, while the cluster-wide context switch system performs the actions needed to switch from the current situation to the new one. As a proof of concept, we evaluate the benefit of the cluster-wide context switch through a sample scheduler that executes jobs as early as possible, even partially, according to their current resource requirements and their priority.
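
The separation described in the abstract (a policy that only picks the target situation, and a generic mechanism that works out the actions) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `VMState` type, the state names ("waiting", "running", "sleeping", "terminated"), the action names, and the simple three-phase ordering are all assumptions made for illustration, and the sketch ignores the feasibility and efficiency concerns the paper addresses.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class VMState:
    """Hypothetical VM descriptor; state names are illustrative assumptions."""
    status: str            # "waiting", "running", "sleeping", or "terminated"
    host: Optional[str]    # node hosting the VM, or None when it is not placed


# Assumed ordering heuristic: actions that free resources come first,
# then relocations, then actions that consume resources.
PHASE = {"suspend": 0, "stop": 0, "migrate": 1, "resume": 2, "run": 2}


def plan_context_switch(current: Dict[str, VMState],
                        target: Dict[str, VMState]) -> List[str]:
    """Return the actions that move every VM from `current` to `target`."""
    actions: List[Tuple[str, str]] = []
    for vm in sorted(target):
        src, dst = current[vm], target[vm]
        if src == dst:
            continue  # nothing to do for this VM
        if src.status == "running" and dst.status == "running":
            actions.append(("migrate", f"migrate {vm} {src.host} -> {dst.host}"))
        elif src.status == "running" and dst.status == "sleeping":
            actions.append(("suspend", f"suspend {vm} on {src.host}"))
        elif src.status == "sleeping" and dst.status == "running":
            actions.append(("resume", f"resume {vm} on {dst.host}"))
        elif src.status == "waiting" and dst.status == "running":
            actions.append(("run", f"run {vm} on {dst.host}"))
        elif dst.status == "terminated":
            actions.append(("stop", f"stop {vm}"))
        else:
            raise ValueError(f"unsupported transition for {vm}: {src} -> {dst}")
    # sorted() is stable, so actions within a phase keep VM-name order.
    ordered = sorted(actions, key=lambda action: PHASE[action[0]])
    return [description for _, description in ordered]
```

In this sketch a scheduling policy would only build the `target` dictionary; `plan_context_switch` then derives the suspend, stop, migrate, resume, and run actions. A real mechanism would also have to check that each action is feasible on its destination node at the moment it executes, which this fixed phase ordering does not guarantee.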