To make the most effective application placement decisions on volatile, large-scale, heterogeneous Grids, schedulers must consider factors such as resource speed, load, and reliability. Including reliability requires availability predictors, which examine different periods of resource history and use various strategies to predict resource behavior. Prediction accuracy significantly affects the quality of the schedule, as does the method by which schedulers combine these factors, in particular the relative weight given to predicted availability, speed, load, and other factors. This paper explores how predicted availability should be used to improve scheduling, concentrating on multi-state availability predictors. We propose and study several classes of schedulers, and a method for combining factors. We characterize the inherent tradeoff between application makespan and the number of evictions due to failure, and demonstrate how our schedulers can navigate this tradeoff under various scenarios. We vary application load and length, and the percentage of jobs that are checkpointable. Our results show that, compared with our scheduler, the only other multi-state prediction-based scheduler causes up to 51% more evicted jobs while simultaneously increasing average job makespan by 18%.
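The idea of weighting predicted availability against speed and load can be illustrated with a minimal sketch. The abstract does not specify the paper's combination method, so everything here is an assumption made for illustration: the linear weighted score, the `Resource` fields, and the names `score`, `pick_resource`, and `w_avail` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical view of a Grid resource as seen by a scheduler."""
    name: str
    speed: float        # relative speed, higher is faster (assumed unit)
    load: float         # fraction of capacity already in use, 0.0 to 1.0
    p_available: float  # predicted probability of staying available, 0.0 to 1.0


def score(resource, max_speed, w_avail):
    """Linear blend of predicted availability and load-adjusted speed.

    This linear form is an illustrative assumption, not the paper's method.
    """
    # Load-adjusted speed, normalized into [0, 1] by the fastest resource.
    performance = resource.speed * (1.0 - resource.load) / max_speed
    return w_avail * resource.p_available + (1.0 - w_avail) * performance


def pick_resource(resources, w_avail=0.5):
    """Return the resource with the highest combined score."""
    if not resources:
        raise ValueError("no resources to choose from")
    max_speed = max(r.speed for r in resources)
    return max(resources, key=lambda r: score(r, max_speed, w_avail))


# Two made-up resources: one fast but unreliable, one slow but reliable.
resources = [
    Resource("fast_flaky", speed=4.0, load=0.0, p_available=0.2),
    Resource("slow_stable", speed=2.0, load=0.0, p_available=0.95),
]
speed_first = pick_resource(resources, w_avail=0.0)
reliability_first = pick_resource(resources, w_avail=1.0)
```

In this sketch, a weight near 0 favors fast resources (aiming at shorter makespan), while a weight near 1 favors resources predicted to stay available (aiming at fewer evictions), which mirrors the tradeoff the abstract describes.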