Safe compiler-driven transaction checkpointing and recovery

  • Authors:
  • Jaswanth Sreeram;Santosh Pande

  • Affiliations:
  • Intel Labs, Santa Clara, CA, USA;Georgia Institute of Technology, Atlanta, GA, USA

  • Venue:
  • Proceedings of the ACM international conference on Object oriented programming systems languages and applications
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Several studies have shown that a large fraction of the work performed inside memory transactions in representative programs is wasted due to the transaction experiencing a conflict and aborting. Aborts inside long running transactions are especially influential to performance and the simplicity of the TM programming model (relative to using finegrained locking) in synchronizing large critical sections means that large transactions are common and this exacerbates the problem of wasted work. In this paper we present a practical transaction checkpoint and recovery scheme in which transactions that experience a conflict can restore their state (including the local context in which they were executing) to some dynamic program point before this access and begin execution from that point. This state saving and restoration is implemented by checkpoint operations that are generated by a compiler into the transaction's body and are also optimized to reduce the amount of state that is saved and restored. We also describe a runtime system that manages these checkpointed states and orchestrates the restoration of the right checkpointed state for a conflict on a particular transactional access. Moreover the synthesis of these save and restore operations, their optimization and invocation at runtime are completely transparent to the programmer. We have implemented the checkpoint generation and optimization scheme in the LLVM compiler and runtime support for the TL2 STM system. Our experiments indicate that for many parallel programs using such checkpoint recovery schemes can result in upto several orders of magnitude reduction in number of aborts and significant execution time speedups relative to plain transactional programs for the same number of threads.