Scheduling and page migration for multiprocessor compute servers

  • Authors:
  • Rohit Chandra; Scott Devine; Ben Verghese; Anoop Gupta; Mendel Rosenblum

  • Affiliations:
  • Computer Systems Laboratory, Stanford University, Stanford, CA (all five authors)

  • Venue:
  • ASPLOS VI: Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems
  • Year:
  • 1994

Abstract

Several cache-coherent shared-memory multiprocessors have been developed that are scalable and offer a very tight coupling between the processing resources. They are therefore quite attractive for use as compute servers for multiprogramming and parallel application workloads. Process scheduling and memory management, however, remain challenging due to the distributed main memory found on such machines. This paper examines the effects of OS scheduling and page migration policies on the performance of such compute servers. Our experiments are done on the Stanford DASH, a distributed-memory cache-coherent multiprocessor. We show that for our multiprogramming workloads consisting of sequential jobs, the traditional Unix scheduling policy does very poorly. In contrast, a policy incorporating cluster and cache affinity along with a simple page-migration algorithm offers up to two-fold performance improvement. For our workloads consisting of multiple parallel applications, we compare space-sharing policies that divide the processors among the applications to time-slicing policies such as standard Unix or gang scheduling. We show that space-sharing policies can achieve better processor utilization due to the operating point effect, but time-slicing policies benefit strongly from user-level data distribution. Our initial experience with automatic page migration suggests that policies based only on TLB miss information can be quite effective, and useful for addressing the data distribution problems of space-sharing schedulers.
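
The abstract does not spell out the migration algorithm, so the following is only a minimal sketch of what a policy driven purely by TLB miss information might look like. It assumes the OS sees each TLB miss together with the cluster that took it, and moves a page to a remote cluster once that cluster has missed on it a given number of times without an intervening local miss. The class name `TlbMissMigrator`, the threshold value, and the reset-on-local-miss rule are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical tuning knob: remote TLB misses tolerated before a page moves.
MIGRATE_THRESHOLD = 4


@dataclass
class Page:
    home: int  # cluster whose local memory currently holds the page
    remote_misses: dict = field(default_factory=lambda: defaultdict(int))


class TlbMissMigrator:
    """Toy page-migration policy that uses only TLB miss events (assumed design)."""

    def __init__(self, threshold=MIGRATE_THRESHOLD):
        self.threshold = threshold
        self.pages = {}
        self.migrations = 0

    def add_page(self, page_id, home_cluster):
        self.pages[page_id] = Page(home=home_cluster)

    def on_tlb_miss(self, page_id, cluster):
        """Record one TLB miss and return the page's home cluster afterwards."""
        page = self.pages[page_id]
        if cluster == page.home:
            # A local miss suggests the page is still used where it lives,
            # so forget any remote misses accumulated so far.
            page.remote_misses.clear()
            return page.home
        page.remote_misses[cluster] += 1
        if page.remote_misses[cluster] >= self.threshold:
            # Enough remote misses from one cluster: move the page there.
            page.home = cluster
            page.remote_misses.clear()
            self.migrations += 1
        return page.home


if __name__ == "__main__":
    migrator = TlbMissMigrator(threshold=3)
    migrator.add_page(page_id=0, home_cluster=0)
    for _ in range(3):
        home = migrator.on_tlb_miss(0, cluster=1)
    print("home cluster:", home, "migrations:", migrator.migrations)
```

In this sketch a process that an affinity scheduler keeps on one cluster gradually pulls its pages into that cluster's memory, which is the kind of data distribution the abstract says space-sharing schedulers otherwise lack. A real policy would also need to weigh the cost of each migration, which this toy version ignores.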