Adaptive scheduling under memory constraints on non-dedicated computational farms

Authors:
Dimitrios S. Nikolopoulos;Constantine D. Polychronopoulos
Affiliations:
Department of Computer Science, The College of William and Mary, McGlothlin Street Hall, Williamsburg, VA;Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL
Venue:
Future Generation Computer Systems - Selected papers from CCGRID 2002
Year:
2003

Abstract

This paper presents scheduler extensions that enable parallel programs to adapt better to the execution conditions of non-dedicated computational farms with limited memory resources. The techniques aim to prevent thrashing and to co-schedule communicating threads, using two disjoint yet cooperating extensions to the kernel scheduler. A thrashing prevention module enables memory-bound programs to adapt to memory shortage by suspending their threads at selected points of execution. Threads are suspended at memory allocation points so that parallel jobs, which are assumed to run as guests on the nodes of the computational farm, do not over-commit memory. In the event of thrashing, parallel jobs are the first to release memory, which helps local resident jobs make progress. Adaptation is implemented using a shared-memory interface in the /proc filesystem and upcalls from the kernel to user space. On an orthogonal axis, co-scheduling is implemented in the kernel with a heuristic that periodically boosts the priority of communicating threads.

Using experiments on a cluster of workstations, we show that when a guest parallel job competes with general-purpose interactive, I/O-intensive, or CPU- and memory-intensive load on the nodes of the cluster, thrashing prevention drastically reduces the slowdown of the job at memory utilization levels of 20% or higher. The slowdown of parallel jobs is reduced by up to a factor of 7. Co-scheduling provides a limited performance improvement at memory utilization levels below 20%, but has no significant effect at higher memory utilization levels.
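
The thrashing prevention idea in the abstract (a guest thread is suspended at a memory allocation point rather than being allowed to over-commit memory) can be illustrated with a small user-space sketch. This is not the authors' implementation: the paper describes a kernel module with a /proc shared-memory interface and upcalls, none of which is modeled here. The names `NodeMemory`, `guest_may_allocate`, and `at_allocation_point`, and the simple "request must fit in free memory" rule, are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeMemory:
    """Hypothetical snapshot of one node's memory, in megabytes."""
    total_mb: int
    used_by_local_mb: int   # memory held by the node's local resident jobs
    used_by_guest_mb: int   # memory held by the guest parallel job

    @property
    def free_mb(self) -> int:
        return self.total_mb - self.used_by_local_mb - self.used_by_guest_mb


def guest_may_allocate(mem: NodeMemory, request_mb: int) -> bool:
    """Assumed admission rule: the request must fit in currently free memory."""
    return request_mb <= mem.free_mb


def at_allocation_point(mem: NodeMemory, request_mb: int) -> str:
    """Decide what a guest thread does when it reaches an allocation point.

    Returns "run" and records the allocation if memory would not be
    over-committed; otherwise returns "suspend" and leaves memory untouched.
    """
    if guest_may_allocate(mem, request_mb):
        mem.used_by_guest_mb += request_mb
        return "run"
    return "suspend"


if __name__ == "__main__":
    node = NodeMemory(total_mb=1024, used_by_local_mb=600, used_by_guest_mb=200)
    print(at_allocation_point(node, 100))  # fits in the 224 MB that are free
    print(at_allocation_point(node, 200))  # exceeds the remaining free memory
```

In this sketch the guest always yields to local jobs, which mirrors the abstract's assumption that parallel jobs run as guests. A real mechanism would also need a way to resume suspended threads once memory becomes available, which is left out here.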