ATOP: space and time adaptation for parallel and grid applications via flexible data partitioning

Authors: Angela C. Sodan; Lin Han
Affiliation: University of Windsor, Windsor, Ontario, Canada (both authors)
Venue: ARM '04 Proceedings of the 3rd workshop on Adaptive and reflective middleware
Year: 2004

Abstract

Adaptive resource allocation is becoming an important feature for running parallel and grid applications: it allows space and time to be shared better according to the current workload, lets the scheduler work around obstacles such as reservations, and copes with varying system load under time-shared execution and with the lack of accurate performance predictability on heterogeneous resources. Adaptation is potentially very expensive if total data repartitioning is required. Existing approaches that implement large numbers of MPI "processes" via threads suffer from frequent thread switches, inefficient local communication, and being fixed to the chosen number of threads. Our ATOP middleware instead uses as many processes as there are processors, partitions and migrates the data, and processes the data held by each process as one data collection. For partitioning and migration, we employ the Zoltan load-balancing library, which is highly portable and supports a large variety of load-balancing approaches, including those of ParMETIS and Jostle. Exploiting features of Zoltan, we propose pre-partitioning (over-partitioning) of data graphs, which reduces adaptation cost down to 25%, while retaining the flexibility to partition from scratch in cases where over-partitioning does not perform well or where a non-fitting number of resources must be chosen.
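
The pre-partitioning idea in the abstract can be illustrated with a small sketch. It is not ATOP's implementation and does not call Zoltan. It assumes the data graph has already been split into a fixed number of equal-weight parts, and that adapting to a new process count only means remapping those parts. The function names, the contiguous block mapping, and the 1.1 imbalance tolerance are all hypothetical choices made for illustration.

```python
from collections import Counter


def assign_parts(num_parts: int, num_procs: int) -> list[int]:
    """Map each pre-computed part to a process in contiguous, near-equal blocks."""
    return [part * num_procs // num_parts for part in range(num_parts)]


def parts_to_migrate(old: list[int], new: list[int]) -> list[int]:
    """Return the parts whose owning process differs between two assignments."""
    return [part for part, (a, b) in enumerate(zip(old, new)) if a != b]


def imbalance(assignment: list[int]) -> float:
    """Heaviest process load divided by the average load (1.0 is perfect)."""
    loads = Counter(assignment)
    return max(loads.values()) * len(loads) / len(assignment)


def choose_strategy(num_parts: int, num_procs: int, tolerance: float = 1.1) -> str:
    """Hypothetical rule: reuse the pre-partitioning only if it balances well."""
    if imbalance(assign_parts(num_parts, num_procs)) <= tolerance:
        return "reuse-prepartitioning"
    return "partition-from-scratch"


NUM_PARTS = 24  # assumed over-partitioning: more parts than processes

# Growing from 4 to 6 processes: the parts are remapped, not recomputed.
before = assign_parts(NUM_PARTS, 4)
after = assign_parts(NUM_PARTS, 6)
moved = parts_to_migrate(before, after)

# 24 parts divide evenly over 6 processes but not over 7.
fitting = choose_strategy(NUM_PARTS, 6)
non_fitting = choose_strategy(NUM_PARTS, 7)
```

In this sketch the parts are fixed once, so a change in process count costs only a remapping plus the migration of the parts listed in `moved`. When the part count does not divide well over the new process count, the rule falls back to partitioning from scratch, mirroring the "non-fitting numbers of resources" case the abstract mentions.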