Most space-sharing parallel computers presently operated by high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users of these batch-queued resources have accounts at multiple sites and can choose the site or sites at which to submit a parallel job. In such a situation, the time a job waits in any one batch queue can significantly affect the overall time from job submission to job completion. In this work, we explore a new method for providing end-users with predicted bounds on the queuing delay that individual jobs will experience. We evaluate this method using batch scheduler logs for distributed-memory parallel machines that cover a 9-year period at 7 large HPC centers. Our results show that it is possible to predict delay bounds reliably for jobs in different queues and for jobs requesting different ranges of processor counts. Using this information, scientific application developers can intelligently decide where to submit their parallel codes in order to minimize overall turnaround time.
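The abstract does not spell out how the bounds are computed, so the following is only an illustrative sketch of one generic way to bound a queuing delay from historical wait times; it should not be read as the method evaluated in this work. It uses a standard binomial order-statistic argument and assumes the observed waits are independent and identically distributed, which real queue logs may violate. The function name `quantile_upper_bound` and its parameters are hypothetical.

```python
from math import comb


def quantile_upper_bound(waits, quantile=0.95, confidence=0.95):
    """Upper bound on the given quantile of queue wait times.

    Assumption (not from the source): waits are i.i.d. samples.
    Returns the smallest observed wait that exceeds the true quantile
    with at least the requested confidence, or None when the history
    is too short to support any such bound.
    Intended for modest history sizes; very large inputs would need a
    numerically safer binomial computation.
    """
    n = len(waits)
    ordered = sorted(waits)
    cumulative = 0.0
    for k in range(1, n + 1):
        # Add P(exactly k-1 of the n samples fall below the true quantile).
        cumulative += (
            comb(n, k - 1) * quantile ** (k - 1) * (1 - quantile) ** (n - k + 1)
        )
        # cumulative now equals P(the k-th smallest sample >= true quantile).
        if cumulative >= confidence:
            return ordered[k - 1]
    return None


if __name__ == "__main__":
    # Hypothetical wait times in seconds for one queue.
    history = [30, 45, 60, 120, 300, 600, 900, 1800, 3600, 7200]
    print(quantile_upper_bound(history, quantile=0.5, confidence=0.95))
```

A user with accounts at several sites could compute such a bound per queue from each site's recent history and submit to the queue with the smallest bound. A `None` result signals that more history is needed before any bound at that quantile and confidence can be stated.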