The AppLeS parameter sweep template: user-level middleware for the grid
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Condor-G: A Computation Management Agent for Multi-Institutional Grids
HPDC '01 Proceedings of the 10th IEEE International Symposium on High Performance Distributed Computing
Orchestrating and Coordinating Scientific/Engineering Workflows using GridShell
HPDC '04 Proceedings of the 13th IEEE International Symposium on High Performance Distributed Computing
Proceedings of the 5th IEEE workshop on Challenges of large applications in distributed environments
Creating private network overlays for high performance scientific computing
Proceedings of the ACM/IFIP/USENIX 2007 International Conference on Middleware
Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers
GridBot: execution of bags of tasks in multiple grids
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Creating private network overlays for high performance scientific computing
MIDDLEWARE2007 Proceedings of the 8th ACM/IFIP/USENIX international conference on Middleware
Towards jungle computing with Ibis/Constellation
Proceedings of the 2011 workshop on Dynamic distributed data-intensive applications, programming abstractions, and systems
Hi-index | 0.01 |
We describe a system for creating personal clusters in user-space to support the submission and management of thousands of compute-intensive serial jobs to the network-connected compute resources on the NSF TeraGrid. The system implements a robust infrastructure that submits and manages job proxies across a distributed computing environment. These job proxies contribute resources to personal clusters created dynamically for a user on-demand. The personal clusters then adapt to the prevailing job load conditions at the distributed sites by migrating job proxies to sites expected to provide resources more quickly. Furthermore, the system allows multiple instances of these personal clusters to be created as containers for individual scientific experiments, allowing the submission environment to be customized for each instance. The version of the system described in this paper allows users to build large personal Condor and Sun Grid Engine clusters on the TeraGrid. Users then manage their scientific jobs, within each personal cluster, with a single uniform interface using the feature-rich functionality found in these job management environments.