Safety scheduling strategies in distributed computing

Authors:
Victor V. Toporkov; Alexey Tselishchev
Affiliations:
Computer Science Department, Moscow Power Engineering Institute, ul. Krasnokazarmennaya 14, Moscow, 111250 Russia; European Organization for Nuclear Research (CERN), 1211 Geneva 23, Switzerland
Venue:
International Journal of Critical Computer-Based Systems
Year:
2010

Abstract

In this paper, we present an approach to safety scheduling in distributed computing based on resource co-allocation strategies for complex sets of tasks (jobs). Guaranteeing that jobs complete within their time limits requires accounting for the dynamics of the distributed environment: changes in the number of jobs to be serviced, changes in the volume of computations, possible failures of processor nodes, and so on. Consequently, in the general case a single schedule is not enough; what is required is a strategy, that is, a set of alternative versions of scheduling and resource co-allocation. Safety strategies are formed for structurally different job models with various levels of task granularity and data replication policies. We develop and examine scheduling strategies that combine fine-grain and coarse-grain computations, multiple data replicas, and constrained data movement. These strategies are evaluated in simulation studies against a variety of metrics.