Adaptive scheduling under memory constraints on non-dedicated computational farms

  • Authors:
  • Dimitrios S. Nikolopoulos
  • Constantine D. Polychronopoulos

  • Affiliations:
  • Department of Computer Science, The College of William and Mary, McGlothlin Street Hall, Williamsburg, VA
  • Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL

  • Venue:
  • Future Generation Computer Systems - Selected papers from CCGRID 2002
  • Year:
  • 2003

Abstract

This paper presents scheduler extensions that enable parallel programs to adapt better to the execution conditions of non-dedicated computational farms with limited memory resources. The techniques aim to prevent thrashing and to co-schedule communicating threads, using two disjoint yet cooperating extensions to the kernel scheduler. A thrashing-prevention module enables memory-bound programs to adapt to memory shortage by suspending their threads at selected points of execution. Thread suspension ensures that memory is not over-committed at memory-allocation points by parallel jobs, which are assumed to run as guests on the nodes of the computational farm. In the event of thrashing, parallel jobs are the first to release memory and help local resident jobs make progress. Adaptation is implemented using a shared-memory interface in the /proc filesystem and upcalls from the kernel to user space. On an orthogonal axis, co-scheduling is implemented in the kernel with a heuristic that periodically boosts the priority of communicating threads.

Using experiments on a cluster of workstations, we show that when a guest parallel job competes with general-purpose interactive, I/O-intensive, or CPU- and memory-intensive load on the nodes of the cluster, thrashing prevention drastically reduces the slowdown of the job at memory utilization levels of 20% or higher; the slowdown of parallel jobs is reduced by up to a factor of 7. Co-scheduling provides a limited performance improvement at memory utilization levels below 20%, but has no significant effect at higher memory utilization levels.