Cloud computing is becoming a popular solution for storing data and executing applications because its on-demand, pay-per-use policy gives access to virtually unlimited resources. In this context, applications such as those oriented towards Web 2.0 are beginning to be migrated to cloud systems. Web 2.0 applications are usually composed of several components that run indefinitely and need to be available to end users throughout their execution life cycle. Their availability strongly depends on the number of resource failures and on the variation in user hit rate. These problems are usually solved through scaling. A scaled application can spread its components across several nodes, so if one or more of those nodes fail, the application could become unavailable. We therefore require a method of ensuring the application's functionality regardless of the number of node failures. In this paper we propose to build highly available applications, i.e., systems with low downtime, by taking advantage of the component-based architecture and of the application scaling property. We present a solution for finding the optimal number of component types needed on nodes so that every type is present on every allocated node. In addition, the load on each node cannot exceed a maximum threshold, and the total running cost of the applications needs to be minimized. A sub-optimal solution is also given. Both solutions rely on genetic algorithms to achieve their goals. The efficiency of the sub-optimal algorithm is studied with respect to its success rate, i.e., the probability that the schedule provides highly available applications when all but one node fail. Tests performed on the sub-optimal algorithm in terms of node load, closeness to the optimal solution, and success rate demonstrate the algorithm's efficiency.
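The constraints and the success-rate measure described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm and omits the genetic search entirely; it only shows how a candidate schedule might be evaluated. The placement representation (one dictionary of instance counts per node), the per-instance load and cost tables, and all function names are assumptions made for illustration. The success rate is interpreted here as the fraction of nodes that host every component type, i.e., the chance that a single surviving node, picked uniformly at random, can still serve the whole application.

```python
from typing import Dict, List

# Hypothetical representation: one dict per node, mapping a component
# type to the number of instances of that type placed on the node.
Placement = List[Dict[str, int]]


def node_is_complete(node: Dict[str, int], component_types: List[str]) -> bool:
    """True if the node hosts at least one instance of every component type."""
    return all(node.get(t, 0) > 0 for t in component_types)


def success_rate(placement: Placement, component_types: List[str]) -> float:
    """Fraction of nodes that could serve the application on their own.

    Assumption: the single surviving node is equally likely to be any node.
    """
    if not placement:
        return 0.0
    complete = sum(node_is_complete(n, component_types) for n in placement)
    return complete / len(placement)


def respects_threshold(placement: Placement,
                       load_per_instance: Dict[str, int],
                       max_load: int) -> bool:
    """True if no node's total load exceeds the maximum threshold."""
    return all(
        sum(count * load_per_instance[t] for t, count in node.items()) <= max_load
        for node in placement
    )


def running_cost(placement: Placement, cost_per_instance: Dict[str, int]) -> int:
    """Total cost of all placed instances (the quantity to be minimized)."""
    return sum(
        count * cost_per_instance[t]
        for node in placement
        for t, count in node.items()
    )


# Example schedule: the second node lacks the "db" component type.
example = [{"web": 1, "db": 1}, {"web": 2}, {"web": 1, "db": 1}]
types = ["web", "db"]
loads = {"web": 30, "db": 40}   # hypothetical load units per instance
costs = {"web": 2, "db": 5}     # hypothetical cost units per instance
```

A genetic algorithm could use `respects_threshold` as a feasibility check and combine `running_cost` with `success_rate` in its fitness function, although how the paper weighs these terms is not stated in the abstract.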