Unbounded Transactional Memory

  • Authors:
  • C. Scott Ananian; Krste Asanovic; Bradley C. Kuszmaul; Charles E. Leiserson; Sean Lie

  • Affiliations:
  • MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA (all authors)

  • Venue:
  • HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
  • Year:
  • 2005

Abstract

Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional memory, called UTM, which exploits the common case for performance without sacrificing correctness on transactions whose footprint can be nearly as large as virtual memory. We performed a cycle-accurate simulation of a simplified architecture, called LTM. LTM is based on UTM but is easier to implement, because it does not change the memory subsystem outside of the processor. LTM allows nearly unbounded transactions, whose footprint is limited only by physical memory size and whose duration by the length of a timeslice. We assess UTM and LTM through microbenchmarking and by automatically converting the SPECjvm98 Java benchmarks and the Linux 2.4.19 kernel to use transactions instead of locks. We use both cycle-accurate simulation and instrumentation to understand benchmark behavior. Our studies show that the common case is small transactions that commit, even when contention is high, but that some applications contain very large transactions. For example, although 99.9% of transactions in the Linux study touch 54 cache lines or fewer, some transactions touch over 8000 cache lines. Our studies also indicate that hardware support is required, because some applications spend over half their time in critical regions. Finally, they suggest that hardware support for transactions can make Java programs run faster than when run using locks and can increase the concurrency of the Linux kernel by as much as a factor of 4 with no additional programming work.
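
The idea of running a critical region as a transaction instead of under a lock can be illustrated with a toy software model. This is not the UTM or LTM hardware design, which the abstract does not detail; it is a minimal sketch that assumes version-stamped memory cells, commit-time validation, and retry on conflict. All names here (VersionedMemory, Transaction, atomically, increment) are hypothetical and chosen only for illustration.

```python
import threading


class VersionedMemory:
    """Toy shared memory: each address maps to a (value, version) pair."""

    def __init__(self):
        self._cells = {}                      # address -> (value, version)
        self._commit_lock = threading.Lock()  # serializes commits only

    def read(self, addr):
        # Unwritten addresses read as value 0, version 0.
        return self._cells.get(addr, (0, 0))

    def try_commit(self, read_set, write_set):
        """Validate the read set, then publish the write set atomically."""
        with self._commit_lock:
            for addr, seen_version in read_set.items():
                if self.read(addr)[1] != seen_version:
                    return False              # conflict: caller must retry
            for addr, value in write_set.items():
                version = self.read(addr)[1]
                self._cells[addr] = (value, version + 1)
            return True


class Transaction:
    """Buffers writes and records the version of every location read."""

    def __init__(self, memory):
        self.memory = memory
        self.read_set = {}    # address -> version first observed
        self.write_set = {}   # address -> speculative value

    def load(self, addr):
        if addr in self.write_set:            # read our own buffered write
            return self.write_set[addr]
        value, version = self.memory.read(addr)
        self.read_set.setdefault(addr, version)
        return value

    def store(self, addr, value):
        self.write_set[addr] = value          # invisible until commit


def atomically(memory, body, max_retries=1000):
    """Run body(tx) until it commits; return (result, aborted_attempts)."""
    for attempt in range(max_retries):
        tx = Transaction(memory)
        result = body(tx)
        if memory.try_commit(tx.read_set, tx.write_set):
            return result, attempt
    raise RuntimeError("transaction did not commit")


def increment(tx):
    # A critical region written without naming any lock.
    tx.store("counter", tx.load("counter") + 1)
```

In this model a transaction's footprint is simply the size of its read and write sets, which grow without a fixed bound. A hardware scheme would track that footprint in the cache and memory system instead of in dictionaries. The single commit lock is a simplification for the sketch, not a claim about how the hardware detects conflicts.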