Transparent, lightweight application execution replay on commodity multiprocessor operating systems

  • Authors: Oren Laadan, Nicolas Viennot, Jason Nieh
  • Affiliation: Columbia University, New York, NY, USA (all authors)
  • Venue: Proceedings of the ACM SIGMETRICS international conference on Measurement and modeling of computer systems
  • Year: 2010

Abstract

We present Scribe, the first system to provide transparent, low-overhead application record-replay and the ability to go live from replayed execution. Scribe introduces new lightweight operating system mechanisms, rendezvous and sync points, to efficiently record nondeterministic interactions such as related system calls, signals, and shared memory accesses. Rendezvous points make a partial ordering of execution based on system call dependencies sufficient for replay, avoiding the recording overhead of maintaining an exact execution ordering. Sync points convert asynchronous interactions that can occur at arbitrary times into synchronous events that are much easier to record and replay. We have implemented Scribe without changing, relinking, or recompiling applications, libraries, or operating system kernels, and without any specialized hardware support such as hardware performance counters. It works on commodity Linux operating systems, and commodity multi-core and multiprocessor hardware. Our results show for the first time that an operating system mechanism can correctly and transparently record and replay multi-process and multi-threaded applications on commodity multiprocessors. Scribe recording overhead is less than 2.5% for server applications including Apache and MySQL, and less than 15% for desktop applications including Firefox, Acrobat, OpenOffice, parallel kernel compilation, and movie playback.
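The claim that a partial ordering is sufficient for replay can be illustrated with a small user-space sketch. This is not Scribe's implementation, which lives in the operating system. It is a toy model under stated assumptions: each nondeterministic event is reduced to a hypothetical (thread, resource) pair, recording stamps each event with a per-resource sequence number, and replay lets a thread proceed only when its next event matches the resource's current sequence number. The names `record`, `replay`, and `per_resource_order` are invented for this illustration.

```python
from collections import defaultdict


def record(events):
    """Build per-thread logs from events observed in some global order.

    events: list of (thread_id, resource) pairs (a hypothetical event model).
    Each log entry stores only a per-resource sequence number, so no
    total order across unrelated resources is recorded.
    """
    next_seq = defaultdict(int)
    logs = defaultdict(list)
    for tid, res in events:
        logs[tid].append((res, next_seq[res]))
        next_seq[res] += 1
    return dict(logs)


def replay(logs, schedule):
    """Replay the logs under an arbitrary scheduler preference.

    schedule: thread ids in the order the toy scheduler tries them.
    A thread runs its next event only if the event's recorded sequence
    number equals the resource's current counter; otherwise it waits.
    """
    current = defaultdict(int)
    pos = {tid: 0 for tid in logs}
    remaining = sum(len(log) for log in logs.values())
    order = []
    while remaining:
        for tid in schedule:
            if pos[tid] == len(logs[tid]):
                continue  # this thread has finished its log
            res, seq = logs[tid][pos[tid]]
            if current[res] == seq:
                order.append((tid, res))
                current[res] += 1
                pos[tid] += 1
                remaining -= 1
                break
        else:
            raise RuntimeError("no thread can proceed: inconsistent log")
    return order


def per_resource_order(events):
    """Group thread ids by resource, preserving event order."""
    grouped = defaultdict(list)
    for tid, res in events:
        grouped[res].append(tid)
    return dict(grouped)


recorded = [("A", "pipe"), ("B", "file"), ("A", "file"), ("B", "pipe")]
logs = record(recorded)
replayed = replay(logs, schedule=["B", "A"])
```

In this sketch the replayed global order differs from the recorded one because the toy scheduler prefers thread B, yet every resource still sees its accesses in the recorded order. That is the sense in which a partial order can be enough: only events that touch the same resource need to be ordered relative to each other.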