This paper develops a performance model that is used to control the adaptive execution of the ATR code for solving large stochastic optimization problems on computational grids. A detailed analysis of the execution characteristics of ATR is used to construct the performance model, which is then used to specify (a) near-optimal dynamic values of the parameters that govern the distribution of work, and (b) a new task scheduling algorithm. Together, these new features minimize ATR execution time on any collection of compute nodes, including a varying collection of heterogeneous nodes. The new adaptive code runs up to eight times faster than the previously optimized code and requires no input parameters from the user to guide the distribution of work. Furthermore, the modeling process led to several changes in the Condor runtime environment, including the new task scheduling algorithm, that produce significant performance improvements for master-worker computations and possibly for other types of grid applications.