A case for tracking and exploiting inter-node and intra-node memory content sharing in virtualized large-scale parallel systems

  • Authors:
  • Lei Xia; Peter A. Dinda

  • Affiliations:
  • Northwestern University, Evanston, IL, USA; Northwestern University, Evanston, IL, USA

  • Venue:
  • Proceedings of the 6th International Workshop on Virtualization Technologies in Distributed Computing
  • Year:
  • 2012

Abstract

In virtualized large-scale parallel systems, scientific workloads consist of numerous processes running across many virtual nodes. Their memory footprint is massive, which has consequences for services that enhance performance, reliability, or power. We argue that a service that dynamically tracks the sharing of memory content, both within individual nodes and across nodes, can simplify and enhance the implementation of such services. For example, leveraging content sharing could significantly reduce the size of a checkpoint of a group of nodes. As another example, it could speed up VM migration by allowing a VM's memory to be reconstructed from multiple source VMs. Finally, a service that improves reliability by introducing memory redundancy could leverage existing content sharing to minimize the memory cost of any particular level of redundancy. We argue that both intra- and inter-node memory content sharing are common in parallel applications, and we support this claim with a detailed study of both kinds of sharing at different scales, different granularities, and different times, for a range of applications and application benchmarks. We then describe the high-level approach we are taking to design and implement a distributed, VMM-based system that can efficiently and scalably identify and track such sharing with low overhead.
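
To make the idea of intra-node and inter-node content sharing concrete, here is a minimal sketch in Python. It is not the authors' system: the abstract does not describe their detection mechanism, so the approach here (hashing fixed-size pages with SHA-1 and comparing digests) is an assumption chosen for illustration. The 4 KiB page size, the function names `page_hashes` and `sharing_summary`, and the toy node images are all hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List

PAGE_SIZE = 4096  # assumed granularity; the paper studies several granularities


def page_hashes(memory: bytes, page_size: int = PAGE_SIZE) -> List[bytes]:
    """Split a memory image into fixed-size pages and hash each one."""
    return [
        hashlib.sha1(memory[i:i + page_size]).digest()
        for i in range(0, len(memory), page_size)
    ]


def sharing_summary(nodes: Dict[str, bytes],
                    page_size: int = PAGE_SIZE) -> Dict[str, int]:
    """Count page-level content sharing within and across nodes.

    `nodes` maps a (hypothetical) node name to its raw memory image.
    Hash equality is treated as content equality, which is an
    approximation: a real system would have to handle collisions.
    """
    holders = defaultdict(set)  # page digest -> names of nodes holding it
    total_pages = 0
    intra_node_duplicates = 0   # extra copies of a page inside one node

    for name, memory in nodes.items():
        seen_on_this_node = set()
        for digest in page_hashes(memory, page_size):
            total_pages += 1
            if digest in seen_on_this_node:
                intra_node_duplicates += 1
            seen_on_this_node.add(digest)
            holders[digest].add(name)

    return {
        "total_pages": total_pages,
        # Pages a content-aware checkpoint would need to store once.
        "unique_pages": len(holders),
        "intra_node_duplicates": intra_node_duplicates,
        # Distinct page contents present on more than one node.
        "inter_node_shared": sum(1 for s in holders.values() if len(s) > 1),
    }


# Toy memory images: three distinct page contents spread over two nodes.
page_a = b"\x00" * PAGE_SIZE
page_b = b"\x01" * PAGE_SIZE
page_c = b"\x02" * PAGE_SIZE

demo_nodes = {
    "node0": page_a + page_a + page_b,  # page_a appears twice on node0
    "node1": page_a + page_c,           # page_a is also present on node1
}
demo_summary = sharing_summary(demo_nodes)
```

In this toy input, five pages are resident but only three have distinct content, so a checkpoint that stores each distinct page once would hold three pages instead of five. A scan over whole memory images like this one is only a snapshot; the service the abstract proposes tracks sharing dynamically, which this sketch does not attempt.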