Scheduling highly available applications on cloud environments

Authors: Marc Eduard Fríncu
Venue: Future Generation Computer Systems
Year: 2014

Abstract

Cloud computing is becoming a popular solution for storing data and executing applications because its on-demand, pay-per-use policy gives access to virtually unlimited resources. In this context, applications such as those oriented towards Web 2.0 are beginning to be migrated to cloud systems. Web 2.0 applications are usually composed of several components that run indefinitely and need to be available to end users throughout their execution life cycle. Their availability strongly depends on the number of resource failures and on the variation in user hit rate. These problems are usually solved through scaling. A scaled application can spread its components across several nodes; hence, if one or more nodes fail, the application could become unavailable. We therefore require a method of ensuring the application's functionality regardless of the number of node failures. In this paper we propose to build highly available applications, i.e., systems with low downtimes, by taking advantage of the component-based architecture and of the application scaling property. We present a solution for finding the optimal number of component types needed on nodes so that every type is present on every allocated node. Furthermore, node load cannot exceed a maximum threshold, and the total running cost of the applications needs to be minimized. A sub-optimal solution is also given. Both solutions rely on genetic algorithms to achieve their goals. The efficiency of the sub-optimal algorithm is studied with respect to its success rate, i.e., the probability that the schedule provides highly available applications when all but one node fail. Tests performed on the sub-optimal algorithm in terms of node load, closeness to the optimal solution, and success rate demonstrate the algorithm's efficiency.
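
The success-rate criterion in the abstract can be illustrated with a minimal sketch. This is not the paper's genetic algorithm; it only checks a given placement against the two constraints the abstract names (a per-node load threshold, and every component type present on a node so that the node could serve the application alone). The component names, load values, threshold, and the assumption that every node is equally likely to be the sole survivor are all hypothetical.

```python
from collections import Counter

# Hypothetical data: load that one instance of each component type adds to a node.
COMPONENT_LOAD = {"web": 0.2, "db": 0.4, "cache": 0.1}
MAX_LOAD = 1.0  # assumed maximum load threshold per node


def node_load(instances):
    """Total load on one node, given a mapping of component type -> instance count."""
    return sum(COMPONENT_LOAD[ctype] * count for ctype, count in instances.items())


def is_feasible(placement):
    """True if no node exceeds the maximum load threshold."""
    return all(node_load(inst) <= MAX_LOAD for inst in placement.values())


def survives_alone(instances):
    """True if the node hosts at least one instance of every component type."""
    return all(instances.get(ctype, 0) > 0 for ctype in COMPONENT_LOAD)


def success_rate(placement):
    """Fraction of nodes that could keep the application available as the only
    surviving node (assumes each node is equally likely to be the survivor)."""
    if not placement:
        return 0.0
    ok = sum(survives_alone(inst) for inst in placement.values())
    return ok / len(placement)


# Hypothetical placement: node-c lacks a "db" instance, so it cannot survive alone.
placement = {
    "node-a": Counter(web=2, db=1, cache=1),
    "node-b": Counter(web=1, db=1, cache=2),
    "node-c": Counter(web=3, cache=1),
}

if __name__ == "__main__":
    print("feasible:", is_feasible(placement))
    print("success rate:", success_rate(placement))
```

In this example two of the three nodes host every component type, so the placement is feasible but only partially highly available; a schedule in which every type is present on every allocated node would score 1.0 under this measure.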