Opportunities and pitfalls of multi-core scaling using hardware transaction memory

Authors:
Zhaoguo Wang; Hao Qian; Haibo Chen; Jinyang Li
Affiliations:
Fudan University; Shanghai Jiao Tong University; Shanghai Jiao Tong University; New York University
Venue:
Proceedings of the 4th Asia-Pacific Workshop on Systems
Year:
2013

Abstract

Hardware transactional memory, which promises to simplify and scale up multicore synchronization, has recently become available in mainstream processors in the form of Intel's restricted transactional memory (RTM). Will RTM be a panacea for multi-core scaling? This paper tries to shed some light on this question by studying the performance scalability of a concurrent skip list under competing synchronization techniques: fine-grained locking, lock-free synchronization, and RTM (using both Intel's RTM emulator and a real RTM machine). Our experience suggests that RTM indeed simplifies the implementation; however, considerable care must be taken to get good performance. Specifically, to avoid excessive aborts due to RTM capacity misses or conflicts, programmers should move memory allocation/deallocation out of the RTM region, tune fallback functions, and use compiler optimization.
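
Two of the recommendations above (allocating outside the transactional region and bounding retries before taking a fallback path) can be illustrated with a minimal sketch. This is not the paper's skip list code: it uses a plain sorted linked list for brevity, and `tx_begin`, `tx_end`, `SortedList`, `link_sorted`, `insert`, `keys` and the retry budget of 3 are all hypothetical names and values chosen for illustration. The stand-in `tx_begin` always reports an abort so that the sketch runs on machines without RTM; on real hardware it would presumably be replaced by the `_xbegin()`/`_xend()` intrinsics from `<immintrin.h>`.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for an RTM begin/commit pair (assumption: on RTM
// hardware these would wrap _xbegin()/_xend()). tx_begin() always reports an
// abort here, so every call ends up on the fallback path.
inline bool tx_begin() { return false; }
inline void tx_end() {}

struct Node {
    int key;
    Node* next;
};

struct SortedList {
    Node* head = nullptr;
    std::atomic<bool> fallback_held{false};  // simple spin flag for the fallback path
    int fallback_count = 0;                  // number of inserts that used the fallback

    ~SortedList() {
        // Deallocation also happens outside any transactional region.
        while (head != nullptr) {
            Node* next = head->next;
            delete head;
            head = next;
        }
    }
};

// Link an already-allocated node in key order. Only pointer reads and writes
// happen here, which keeps the transactional footprint small.
static void link_sorted(SortedList& list, Node* node) {
    assert(node != nullptr);
    Node** cur = &list.head;
    while (*cur != nullptr && (*cur)->key < node->key) {
        cur = &(*cur)->next;
    }
    node->next = *cur;
    *cur = node;
}

// max_retries is an arbitrary illustrative budget, not a tuned value.
void insert(SortedList& list, int key, int max_retries = 3) {
    // Allocate BEFORE entering the region: the allocator may touch a lot of
    // memory and would enlarge the read/write set of the transaction.
    Node* node = new Node{key, nullptr};

    for (int attempt = 0; attempt < max_retries; ++attempt) {
        if (tx_begin()) {
            // Reading the flag inside the region makes a concurrent fallback
            // writer conflict with this transaction.
            bool busy = list.fallback_held.load();
            if (!busy) {
                link_sorted(list, node);
            }
            tx_end();
            if (!busy) {
                return;
            }
        }
    }

    // Fallback: the retry budget is spent, so serialize on the spin flag.
    while (list.fallback_held.exchange(true)) {
    }
    ++list.fallback_count;
    link_sorted(list, node);
    list.fallback_held.store(false);
}

// Helper for inspection: collect the keys in list order.
std::vector<int> keys(const SortedList& list) {
    std::vector<int> out;
    for (const Node* n = list.head; n != nullptr; n = n->next) {
        out.push_back(n->key);
    }
    return out;
}
```

Because the stand-in never starts a transaction, every insert in this sketch takes the fallback path after exhausting its retries. How many retries to allow, and whether to retry at all after a capacity abort, is the kind of fallback tuning the abstract refers to; the sketch does not attempt to pick good values.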