Transparently consistent asynchronous shared memory

  • Authors: Hakan Akkan; Latchesar Ionkov; Michael Lang
  • Affiliations: New Mexico Consortium; Los Alamos National Laboratory; Los Alamos National Laboratory
  • Venue: Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
  • Year: 2013

Abstract

The advent of many-core processors is imposing many changes on the operating system. The resources under contention have changed: previously, CPU cycles were the resource in demand and required fair and precise sharing. Now compute cycles are plentiful, but the memory per core is decreasing. In the past, scientific applications used all the CPU cores to finish as fast as possible, with visualization and analysis of the data performed after the simulation finished. With decreasing memory available per core, as well as the higher price (in power and time) of storing data on disk or sending it over the network, it now makes sense to run visualization and analytics applications in situ, while the application is running. Visualization and analytics applications then need to sample the simulation memory with as little interference and as few changes to the simulation code as possible. We propose an asynchronous memory sharing facility that allows consistent states of the memory to be shared between processes without any implicit or explicit synchronization. We distinguish two types of processes: a single producer and one or more observers. The producer modifies the state of the data, making consistent versions of the state available to any observer. The observers, working at different sampling rates, can access the latest available consistent state. Applications that would benefit from this type of facility include checkpointing, process monitoring, unobtrusive process debugging, and the sharing of data for visualization or analytics. To evaluate our ideas, we developed two kernel-level implementations for sharing data asynchronously and compared them to a traditional user-space synchronized multi-buffer method. In our tests we have seen improvements of up to 3.5x over the traditional multi-buffer method with 20% of the data pages touched.
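The abstract contrasts the proposed facility with a traditional user-space synchronized multi-buffer baseline. The sketch below is not taken from the paper; it is a minimal illustration of what such a baseline might look like, assuming a rotating set of buffers guarded by a mutex: the producer fills a scratch buffer, publishes it under the lock as the latest consistent version, and an observer copies the most recently published buffer at its own rate. All names, buffer sizes, and the single-process driver in main() are illustrative assumptions.

    /* Hypothetical sketch of a user-space synchronized multi-buffer
     * scheme, i.e. the kind of baseline the paper compares against.
     * Not the authors' implementation. */
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    #define NBUF   3              /* rotating buffers (assumed)        */
    #define NELEMS 1024           /* elements of simulation state      */

    static double buffers[NBUF][NELEMS];
    static int latest = -1;       /* index of newest consistent buffer */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    /* Producer: compute into a scratch buffer, then publish it. */
    static void producer_step(int step)
    {
        static int next = 0;
        for (int i = 0; i < NELEMS; i++)
            buffers[next][i] = (double)step + i;  /* stand-in for simulation work */

        pthread_mutex_lock(&lock);
        latest = next;            /* publish a consistent version */
        pthread_mutex_unlock(&lock);
        next = (next + 1) % NBUF;
    }

    /* Observer: copy the latest consistent version at its own rate. */
    static void observer_sample(double *snapshot)
    {
        pthread_mutex_lock(&lock);
        if (latest >= 0)
            memcpy(snapshot, buffers[latest], sizeof(buffers[latest]));
        pthread_mutex_unlock(&lock);
    }

    int main(void)
    {
        double snapshot[NELEMS];
        for (int step = 0; step < 4; step++) {
            producer_step(step);
            observer_sample(snapshot);
            printf("step %d: snapshot[0] = %.1f\n", step, snapshot[0]);
        }
        return 0;
    }

The point of the comparison in the abstract is that this baseline forces the producer and observers to synchronize explicitly and to copy whole buffers, whereas the proposed kernel-level facility aims to give observers a consistent snapshot without any synchronization in the producer's path, which matters most when only a fraction of the data pages (e.g. the 20% case reported) actually change.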