A practical problem faced by users of high-performance computers is: How can I automatically load balance my jobs across different batch queues, which are in different administrative domains, if there is no existing grid infrastructure? It is common to have user accounts for a number of individual high-performance systems (e.g., departmental, university, regional) that are administered by different groups. Without an administration-deployed grid infrastructure, one can still create a purely user-level aggregation of individual computing systems. The Trellis Project is developing the techniques and tools to take advantage of a user-level overlay metacomputer. Because placeholder scheduling does not require superuser permissions to set up or configure, it is well-suited to overlay metacomputers. This paper contributes to the practical side of grid and metacomputing by empirically demonstrating that placeholder scheduling can work across different administrative domains, across different local schedulers (i.e., PBS and Sun Grid Engine), and across different programming models (i.e., Pthreads, MPI, and sequential). We also describe a new metaqueue system to manage jobs with explicit workflow dependencies.