A comparison of the use of virtual versus physical snapshots for supporting update-intensive workloads

  • Authors:
  • Darius Šidlauskas;Christian S. Jensen;Simonas Šaltenis

  • Affiliations:
  • Aalborg University;Aarhus University;Aalborg University

  • Venue:
  • DaMoN '12 Proceedings of the Eighth International Workshop on Data Management on New Hardware
  • Year:
  • 2012

Abstract

Deployments of networked sensors fuel online applications that feed on real-time sensor data. This scenario calls for techniques that support the management of workloads that contain queries as well as very frequent updates. This paper compares two approaches to exploiting the parallelism offered by modern processors for supporting such workloads. A general approach to avoiding contention among parallel hardware threads, and thus exploiting the parallelism available in processors, is to maintain two copies, or snapshots, of the data: one for the relatively long-duration queries and one for the frequent and very localized updates. The snapshot that receives the updates is frequently made available to queries, so that queries see up-to-date data. The snapshots may be physical or virtual. Physical snapshots are created using the C library function memcpy. Virtual snapshots are created by the fork system call, which creates a new process that initially has the same data snapshot as the process it was forked from. When the new process carries out updates, the operating system performs the actual memory copying in a copy-on-write manner at memory page granularity. This paper characterizes the circumstances under which each technique is preferable. The use of physical snapshots is surprisingly efficient.
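
The two snapshot mechanisms named in the abstract can be illustrated with a minimal C sketch for a POSIX system. It is not the authors' implementation: the array `data`, its size `N`, and the helper names `physical_snapshot` and `virtual_snapshot_update` are hypothetical and exist only for illustration, and the pipe is simply a convenient way to report back what the forked process saw.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { N = 1024 };   /* hypothetical data set size */
int data[N];         /* hypothetical in-memory data set */

/* Physical snapshot: eagerly copy the whole data set with memcpy.
 * The caller owns the returned buffer and must free() it. */
int *physical_snapshot(void) {
    int *snap = malloc(sizeof data);
    if (snap != NULL) {
        memcpy(snap, data, sizeof data);
    }
    return snap;
}

/* Virtual snapshot: fork() a process that initially shares the parent's
 * pages. The new process applies one update, so the kernel copies only
 * the touched page (copy-on-write); the parent's copy is unaffected.
 * Returns the value the new process read back after its update,
 * or -1 on error. */
int virtual_snapshot_update(int idx, int value) {
    assert(idx >= 0 && idx < N);
    int fd[2];
    if (pipe(fd) != 0) {
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        /* New process: the write below triggers the page copy. */
        close(fd[0]);
        data[idx] = value;
        ssize_t w = write(fd[1], &data[idx], sizeof data[idx]);
        _exit(w == (ssize_t)sizeof data[idx] ? 0 : 1);
    }
    /* Original process: still sees the pre-update data. */
    close(fd[1]);
    int seen = -1;
    ssize_t r = read(fd[0], &seen, sizeof seen);
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return r == (ssize_t)sizeof seen ? seen : -1;
}
```

Calling `physical_snapshot()` and then modifying `data` leaves the returned buffer unchanged, at the cost of copying all `N` elements up front. Calling `virtual_snapshot_update(0, 42)` lets the forked process see 42 at index 0 while `data[0]` in the calling process keeps its old value; only the page containing that element is physically copied.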