This paper was inspired by a study of a complex data set from the Czech National Grid MetaCentrum. Unlike other widely used workloads from the Parallel Workloads Archive or the Grid Workloads Archive, this data set includes additional information about machine failures, job requirements, and machine parameters, which allows more realistic simulations. We show that large differences in the performance of various scheduling algorithms appear when this additional information is used. Moreover, we studied other publicly available workloads and partially reconstructed the information about their machine failures and job requirements using statistical and analytical models, to demonstrate that similar behavior can also be expected for other workloads. We suggest that additional information about both machines and jobs should be incorporated into the workload archives to allow proper and more realistic simulations.