Hybrid Cloud computing has been receiving increasing attention recently. To realize the full potential of the hybrid Cloud platform, an architectural framework for efficiently coupling public and private Clouds is necessary. Because resource failures are inevitable given the increasing functionality and complexity of hybrid Cloud computing, a failure-aware resource provisioning algorithm that can meet end users' quality of service (QoS) requirements is paramount. In this paper, we propose a scalable hybrid Cloud infrastructure as well as resource provisioning policies to assure the users' QoS targets. The proposed policies take into account the workload model and the failure correlations to redirect users' requests to the appropriate Cloud providers. Using real failure traces and a workload model, we evaluate the proposed resource provisioning policies to demonstrate their performance, cost, and performance-cost efficiency. Simulation results reveal that, under realistic working conditions and with user estimates for the requests adopted in the provisioning policies, we are able to improve the users' QoS by about 32% in terms of deadline violation rate and 57% in terms of slowdown, at a limited cost on a public Cloud.
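The two QoS metrics named above can be illustrated with a minimal sketch. It assumes the common definitions, which the abstract itself does not spell out: the deadline violation rate is the fraction of requests that finish after their deadline, and slowdown is response time divided by run time. The `Request` record and both function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Hypothetical record of one completed request (all times in seconds)."""
    submit: float    # time the request was submitted
    start: float     # time execution began
    finish: float    # time execution completed
    deadline: float  # absolute deadline for completion


def deadline_violation_rate(requests: List[Request]) -> float:
    """Fraction of requests that finished after their deadline (assumed definition)."""
    if not requests:
        return 0.0
    missed = sum(1 for r in requests if r.finish > r.deadline)
    return missed / len(requests)


def mean_slowdown(requests: List[Request]) -> float:
    """Mean of response time / run time over all requests (assumed definition)."""
    if not requests:
        return 0.0
    ratios = [(r.finish - r.submit) / (r.finish - r.start) for r in requests]
    return sum(ratios) / len(ratios)


# Illustrative values only; these are not data from the paper.
sample = [
    Request(submit=0.0, start=0.0, finish=10.0, deadline=20.0),   # met deadline, no wait
    Request(submit=0.0, start=10.0, finish=20.0, deadline=15.0),  # missed deadline, waited 10 s
]
violation_rate = deadline_violation_rate(sample)
slowdown = mean_slowdown(sample)
```

Under these assumed definitions, a provisioning policy that redirects waiting requests to a public Cloud would lower both numbers by shortening queue wait time.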