Adaptive data parallel computing on workstation clusters
Journal of Parallel and Distributed Computing
This paper examines the problem of adapting data parallel applications in a shared, dynamic environment of PC or workstation clusters. We developed an analytic framework to compare a wide range of adaptation strategies: dynamic load balancing, migration, and processor addition and removal. We evaluated these strategies with respect to the cost and benefit they provide for three representative parallel applications: an iterative Jacobi solver for Laplace's equation, Gaussian elimination with partial pivoting, and a gene sequence comparison application. We found that the cost and benefit of each method can be predicted with high accuracy (within 10%) for all three applications, and we show that the framework is applicable to a wide variety of parallel applications. We then show that accurate prediction allows the most appropriate method to be selected dynamically. Performance improvement for the three applications ranged from 25% to 45% using our adaptation library. In addition, we dispel the conventional wisdom that migration is too expensive and show that it can be beneficial even for running parallel applications with non-trivial communication.
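The idea of selecting an adaptation method dynamically from predicted cost and benefit can be illustrated with a minimal sketch. It is not the paper's framework or library: the `Prediction` record, the `choose_adaptation` function, and all the numbers are hypothetical, and the sketch assumes that each candidate method comes with a predicted one-time cost and a predicted remaining run time after adapting. A method is worth applying only if its cost plus the run time after adapting is less than the run time expected without adapting.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Prediction:
    """Hypothetical prediction for one adaptation method (not from the paper)."""
    method: str          # e.g. "load_balance", "migrate", "add_processor"
    cost_s: float        # predicted one-time cost of adapting, in seconds
    time_after_s: float  # predicted remaining run time after adapting, in seconds


def choose_adaptation(
    remaining_s: float, predictions: Iterable[Prediction]
) -> Tuple[Optional[str], float]:
    """Pick the method with the largest positive predicted net gain.

    remaining_s is the predicted remaining run time if nothing changes.
    Returns (method, gain_in_seconds), or (None, 0.0) when no method is
    predicted to pay for its own cost.
    """
    best_method: Optional[str] = None
    best_gain = 0.0
    for p in predictions:
        # Net gain: time saved by adapting minus the cost of adapting.
        gain = remaining_s - (p.cost_s + p.time_after_s)
        if gain > best_gain:
            best_method, best_gain = p.method, gain
    return best_method, best_gain


# Illustrative numbers only; they are assumptions, not measurements.
candidates = [
    Prediction("load_balance", cost_s=2.0, time_after_s=80.0),
    Prediction("migrate", cost_s=15.0, time_after_s=60.0),
    Prediction("add_processor", cost_s=30.0, time_after_s=75.0),
]
choice = choose_adaptation(100.0, candidates)
```

With these made-up inputs, migration has the highest cost among the beneficial options but still yields the largest net gain, which mirrors the abstract's point that a more expensive method can be the right choice when its benefit is predicted accurately.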