In this paper, we investigate Cloud computing resource provisioning as a way to extend the computing capacity of local clusters in the presence of failures. We consider three steps in resource provisioning: resource brokering, dispatch sequences, and scheduling. The proposed brokering strategy is based on a stochastic analysis of routing in distributed parallel queues; it takes into account the response times of the Cloud provider and the local cluster as well as the computing cost on both sides. We also propose dispatching with probabilistic and deterministic sequences to redirect requests to the resource providers, and we incorporate checkpointing into several well-known scheduling algorithms to provide a fault-tolerant environment. We propose two cost-aware and failure-aware provisioning policies for an organization that operates a cluster managed with virtual machine technology and seeks to use resources from a public Cloud provider. Simulation results demonstrate that the proposed policies improve the response time of users' requests by a factor of 4.10 under a moderate load, at a limited cost on the public Cloud.
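To make the distinction between probabilistic and deterministic dispatch sequences concrete, the following is a minimal sketch, not the policies proposed in the paper. It assumes the broker has already chosen a routing probability `p_cloud`, the fraction of requests to redirect to the public Cloud. The function names and the credit-accumulation rule used for the deterministic sequence are illustrative assumptions.

```python
import random


def bernoulli_dispatch(p_cloud, n, seed=0):
    """Probabilistic sequence: each request goes to the Cloud
    independently with probability p_cloud (assumed given by the broker)."""
    rng = random.Random(seed)
    return ["cloud" if rng.random() < p_cloud else "cluster" for _ in range(n)]


def deterministic_dispatch(p_cloud, n):
    """Deterministic sequence (illustrative rule, not the paper's):
    accumulate p_cloud per request and send a request to the Cloud
    each time the accumulated credit reaches 1."""
    sequence = []
    credit = 0.0
    for _ in range(n):
        credit += p_cloud
        if credit >= 1.0:
            sequence.append("cloud")
            credit -= 1.0
        else:
            sequence.append("cluster")
    return sequence


if __name__ == "__main__":
    print(bernoulli_dispatch(0.25, 8))
    print(deterministic_dispatch(0.25, 8))
```

Both functions aim to redirect the same long-run fraction of requests. The deterministic version spaces the redirected requests evenly, whereas the probabilistic version can produce bursts toward one provider.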