CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Thousands of servers in a cloud datacenter coordinate tasks to provide reliable, highly available cloud computing services; this coordination is especially crucial for achieving high performance in multi-task processing. Effective mechanisms are therefore needed to prepare for the failure of computing nodes. Numerous studies have tried to address this problem, yet few of the proposed solutions have proven efficient. In this paper, we present a cost- and bandwidth-based scheduling algorithm that speeds up recovery from a saved state in heterogeneous computing environments. The algorithm considers not only the network bandwidth but also the monetary cost that cloud customers (CCs) pay for using cloud resources. To evaluate our proposal, we conducted numerous simulations and compared our method with existing ones. The results show that our approach achieves higher performance, including faster recovery in case of failure, while the overhead when no failure occurs is small in typical scenarios.