This paper reports on our experience using a commercial processor's best-effort hardware transactional memory to improve concurrency in CPython, the reference Python implementation. CPython protects its data structures with a single global lock, which inhibits parallelism when running multiple threads. We modified the CPython interpreter to use the best-effort hardware transactions available in Sun's Rock processor and to fall back on the single global lock when a transaction cannot commit in hardware. The modifications were minimal; however, we had to restructure some of CPython's shared data structures to handle false conflicts arising from CPython's management of shared data. Our results show that the modified CPython interpreter scales almost linearly on small, simple workloads and improves the concurrency of more complex workloads.
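The fallback scheme described in the abstract can be illustrated with a minimal Python simulation. This is a sketch under stated assumptions, not the interpreter changes the paper made: real hardware transactions are issued by processor instructions, which are stood in for here by a hypothetical callable `try_hw_transaction`. The names `run_with_fallback`, `make_flaky_transaction`, and `MAX_RETRIES`, as well as the retry budget of 3, are illustrative and do not come from the paper.

```python
import threading

MAX_RETRIES = 3  # hypothetical retry budget; not a value from the paper


def run_with_fallback(body, try_hw_transaction, global_lock, max_retries=MAX_RETRIES):
    """Run body() in a simulated hardware transaction, else under the lock.

    try_hw_transaction(body) is a hypothetical stand-in for the hardware:
    it returns True if body ran and committed, and False if it aborted
    with no visible effects.
    """
    for _ in range(max_retries):
        # Skip this attempt while another thread holds the global lock.
        if global_lock.locked():
            continue
        if try_hw_transaction(body):
            return "hardware"
    # Best-effort hardware gives no commit guarantee, so fall back.
    with global_lock:
        body()
    return "lock"


def make_flaky_transaction(aborts_before_commit):
    """Build a simulated transaction that aborts a fixed number of times."""
    remaining = [aborts_before_commit]

    def try_hw_transaction(body):
        if remaining[0] > 0:
            remaining[0] -= 1
            return False  # simulated abort: body is not run
        body()
        return True  # simulated commit

    return try_hw_transaction


counter = {"value": 0}


def increment():
    counter["value"] += 1


gil = threading.Lock()
# One abort, then a commit on the second attempt.
print(run_with_fallback(increment, make_flaky_transaction(1), gil))   # hardware
# More aborts than the retry budget allows, so the lock path is taken.
print(run_with_fallback(increment, make_flaky_transaction(10), gil))  # lock
```

In both calls the counter is incremented exactly once. The point of the pattern is that the lock path preserves correctness whenever the hardware cannot commit, while the transactional path lets non-conflicting threads proceed without serializing on the lock.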