Mileage-based contention management in transactional memory

  • Authors:
  • Woojin Choi; Lihang Zhao; Jeff Draper

  • Affiliations:
  • University of Southern California, Marina del Rey, CA, USA; University of Southern California, Marina del Rey, CA, USA; University of Southern California, Marina del Rey, CA, USA

  • Venue:
  • Proceedings of the 21st international conference on Parallel architectures and compilation techniques
  • Year:
  • 2012

Abstract

In Transactional Memory (TM), a conflict occurs when two or more transactions concurrently access the same memory block and at least one of the accesses is a write. How conflicts are managed significantly affects TM performance. There are two alternative approaches to managing conflicts: Reactive Contention Management (RCM) [1] and Proactive Contention Management (PCM) [2]. Previous contention management schemes weight all transactions equally and make decisions based only on information provided by the running transaction instance. In this work, we observe that not all critical sections (transactions) are equally performance-critical. Among the transactions in a program, some matter more than others to the performance of the implemented algorithm, e.g., the producer transaction in a producer-consumer relationship. It is worthwhile to distinguish performance-critical transactions from the rest in order to speed up overall execution. For this purpose, we propose a mileage technique and show its effectiveness in the contexts of both RCM and PCM. To express the criticality of transactions, we define two new instructions, MILEAGE and MRSTCNT. MILEAGE takes one operand, a mileage id (mid). A mid indicates how far a thread has progressed and increases monotonically during program execution. Each processor has a mileage unit, which maintains the current mid and a mileage counter (mcnt) that tracks the number of times MILEAGE has been executed with the current mid as its operand. When MILEAGE is executed with a new mid, that mid is stored in the mid register and the mcnt register is cleared. Each time MILEAGE is executed again with the same mid, the mcnt register is incremented. MRSTCNT clears the mcnt register. When two threads contend with each other, the thread with the smaller mileage value (mid concatenated with mcnt) receives higher priority.
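The mileage unit described above can be modeled in a few lines of software. The following Python sketch is illustrative only and is not the authors' hardware design: the counter width `MCNT_BITS`, the saturating increment, the initial register contents, the tie-breaking rule, and the names `MileageUnit` and `prevails` are all assumptions that the abstract does not specify.

```python
from dataclasses import dataclass

MCNT_BITS = 16  # assumed counter width; the abstract does not give one


@dataclass
class MileageUnit:
    """Per-processor mileage state: the current mid and its repeat counter."""
    mid: int = 0   # assumed reset value; program mids are assumed to start at 1
    mcnt: int = 0

    def mileage(self, mid: int) -> None:
        """Model the MILEAGE instruction executed with operand `mid`."""
        if mid != self.mid:
            # New mid: store it and clear the counter.
            self.mid = mid
            self.mcnt = 0
        else:
            # Same mid again: increment the counter.
            # Saturation at the maximum value is an assumption.
            self.mcnt = min(self.mcnt + 1, (1 << MCNT_BITS) - 1)

    def mrstcnt(self) -> None:
        """Model the MRSTCNT instruction: clear mcnt, keep mid."""
        self.mcnt = 0

    def value(self) -> int:
        """Mileage value: mid concatenated with mcnt (mid in the high bits)."""
        return (self.mid << MCNT_BITS) | self.mcnt


def prevails(a: MileageUnit, b: MileageUnit) -> MileageUnit:
    """Return the unit whose thread gets priority: the smaller mileage value.

    Ties go to `a`; the abstract does not say how ties are broken.
    """
    return a if a.value() <= b.value() else b
```

Placing mid in the high-order bits means that a thread at an earlier point in the program flow always wins, and mcnt only separates threads that share the same mid.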
MILEAGE and MRSTCNT were inserted manually based on source code analysis and performance profiling.
When a conflict is detected, one of the conflicting transactions can continue executing while the others stall or abort to maintain correctness. Traditional RCMs decide which transaction continues based on information from the current instance. For example, age RCM selects the transaction that started earlier, and size RCM selects the transaction that has accessed more memory blocks [1]. Mileage RCM instead bases its decision on the relative importance of each transaction in the program flow (mid) as well as in the dynamic flow (mcnt): on a conflict, it chooses the transaction with the smaller mileage value. In our experiments, mileage RCM provides substantial performance improvements on benchmarks that have performance-critical transactions (bayes and intruder), and it shows no severe slowdown on any of the other benchmarks we evaluated. Mileage RCM achieves an average speedup of 12.52% over age RCM and 23.45% over size RCM.
Conflicts can also be prevented by throttling the number of concurrently running transactions. We propose Speculative Lock Insertion (SLI). After a transaction has aborted more than three times, it sets a global lock upon restart. When the transaction holding the global lock commits, it resets the lock. Every time a thread encounters a transaction, it first checks the global lock. If the lock is set, the thread waits until it is released before starting the transaction; otherwise, the transaction executes immediately. In mileage-based SLI, the aborted transaction not only sets the global lock but also registers its mileage value. When a thread finds the lock set, it compares its own mileage value with the one registered in the global location. If its own value is smaller, it starts execution, ignoring the lock. Because locking in SLI serves performance only, ignoring the lock does not affect correctness, which is still maintained by the underlying TM system. In our experiments, mileage-based SLI achieves an average speedup of 9.55% over Adaptive Transaction Scheduling [2].
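The admission logic of mileage-based SLI can be sketched as a small single-threaded Python model. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `GlobalLock`, `on_restart`, `may_start`, and `on_commit`, the `owner` field, and the choice that a second heavily aborted transaction does not take over an already held lock are all hypothetical. A real system would also need atomic updates to the lock, which this model omits.

```python
from dataclasses import dataclass
from typing import Optional

ABORT_THRESHOLD = 3  # the lock is set after "more than three" aborts


@dataclass
class GlobalLock:
    held: bool = False
    owner: Optional[int] = None    # thread id of the holder (hypothetical field)
    mileage: Optional[int] = None  # mileage value registered by the holder


def on_restart(lock: GlobalLock, tid: int, aborts: int, mileage: int) -> None:
    """An aborted transaction restarts; it may set the global lock."""
    # Assumption: if the lock is already held, it is left with its holder.
    if aborts > ABORT_THRESHOLD and not lock.held:
        lock.held, lock.owner, lock.mileage = True, tid, mileage


def may_start(lock: GlobalLock, tid: int, mileage: int) -> bool:
    """Check performed by a thread before it begins a transaction."""
    if not lock.held or lock.owner == tid:
        return True
    # Mileage-based SLI: a thread with a smaller mileage value ignores the lock.
    return mileage < lock.mileage


def on_commit(lock: GlobalLock, tid: int) -> None:
    """The lock holder resets the global lock when it commits."""
    if lock.held and lock.owner == tid:
        lock.held, lock.owner, lock.mileage = False, None, None
```

A thread for which `may_start` returns `False` would wait and retry the check. Since the lock only throttles concurrency, a thread that bypasses it is still subject to the conflict detection of the underlying TM system.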