Skip lists: a probabilistic alternative to balanced trees
Communications of the ACM
Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Lock-free linked lists and skip lists
Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing
Early experience with a commercial hardware transactional memory implementation
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Rock: A High-Performance Sparc CMT Processor
IEEE Micro
Evaluation of AMD's advanced synchronization facility within a complete transactional memory stack
Proceedings of the 5th European conference on Computer systems
Simplifying concurrent algorithms by exploiting hardware transactional memory
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Hybrid NOrec: a case study in the effectiveness of best effort hardware transactional memory
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Evaluation of Blue Gene/Q hardware support for transactional memories
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Using hardware transactional memory to correct and simplify and readers-writer lock algorithm
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Hi-index | 0.00 |
Hardware transactional memory, which holds the promise to simplify and scale up multicore synchronization, has recently become available in main stream processors in the form of Intel's restricted transactional memory (RTM). Will RTM be a panacea for multi-core scaling? This paper tries to shed some light on this question by studying the performance scalability of a concurrent skip list using competing synchronization techniques, including fine-grained locking, lock-free and RTM (using both Intel's RTM emulator and a real RTM machine). Our experience suggests that RTM indeed simplifies the implementation, however, a lot of care must be taken to get good performance. Specifically, to avoid excessive aborts due to RTM capacity miss or conflicts, programmers should move memory allocation/deallocation out of RTM region, tuning fallback functions, and use compiler optimization.