ACM Transactions on Programming Languages and Systems (TOPLAS)
The SPARC architecture manual: version 8
The SPARC architecture manual: version 8
Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Thread scheduling for multiprogrammed multiprocessors
Proceedings of the tenth annual ACM symposium on Parallel algorithms and architectures
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Hoard: a scalable memory allocator for multithreaded applications
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Proceedings of the 3rd international symposium on Memory management
Speculative lock elision: enabling highly concurrent multithreaded execution
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Even Better DCAS-Based Concurrent Deques
DISC '00 Proceedings of the 14th International Conference on Distributed Computing
Obstruction-Free Synchronization: Double-Ended Queues as an Example
ICDCS '03 Proceedings of the 23rd International Conference on Distributed Computing Systems
Hazard Pointers: Safe Memory Reclamation for Lock-Free Objects
IEEE Transactions on Parallel and Distributed Systems
Scalable lock-free dynamic memory allocation
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
Nonblocking memory management support for dynamic-sized data structures
ACM Transactions on Computer Systems (TOCS)
Dynamic circular work-stealing deque
Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
MetaTM/TxLinux: transactional memory for an operating system
Proceedings of the 34th annual international symposium on Computer architecture
SNZI: scalable NonZero indicators
Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
TxLinux: using and managing hardware transactional memory in an operating system
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
Early experience with a commercial hardware transactional memory implementation
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures
The design of a task parallel library
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
Understanding PARSEC performance on contemporary CMPs
IISWC '09 Proceedings of the 2009 IEEE International Symposium on Workload Characterization (IISWC)
Early experience with a commercial hardware transactional memory implementation
Early experience with a commercial hardware transactional memory implementation
The Art of Multiprocessor Programming
The Art of Multiprocessor Programming
Nonblocking algorithms and backward simulation
DISC'09 Proceedings of the 23rd international conference on Distributed computing
DISC'06 Proceedings of the 20th international conference on Distributed Computing
The inherent complexity of transactional memory and what to do about it
Proceedings of the 29th ACM SIGACT-SIGOPS symposium on Principles of distributed computing
ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Invited paper: the inherent complexity of transactional memory and what to do about it
ICDCN'11 Proceedings of the 12th international conference on Distributed computing and networking
Single-version STMs can be multi-version permissive
ICDCN'11 Proceedings of the 12th international conference on Distributed computing and networking
Hybrid NOrec: a case study in the effectiveness of best effort hardware transactional memory
Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems
Cache index-aware memory allocation
Proceedings of the international symposium on Memory management
On the power of hardware transactional memory to simplify memory management
Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing
STM in the small: trading generality for performance in software transactional memory
Proceedings of the 7th ACM european conference on Computer Systems
What kinds of applications can benefit from transactional memory?
ISCA'10 Proceedings of the 2010 international conference on Computer Architecture
Dynamic synthesis for relaxed memory models
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Delegation and nesting in best-effort hardware transactional memory
Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
Using hardware transactional memory to correct and simplify and readers-writer lock algorithm
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Opportunities and pitfalls of multi-core scaling using hardware transaction memory
Proceedings of the 4th Asia-Pacific Workshop on Systems
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Hi-index | 0.00 |
We explore the potential of hardware transactional memory (HTM) to improve concurrent algorithms. We illustrate a number of use cases in which HTM enables significantly simpler code to achieve similar or better performance than existing algorithms for conventional architectures. We use Sun's prototype multicore chip, code-named Rock, to experiment with these algorithms, and discuss ways in which its limitations prevent better results, or would prevent production use of algorithms even if they are successful. Our use cases include concurrent data structures such as double ended queues, work stealing queues and scalable non-zero indicators, as well as a scalable malloc implementation and a simulated annealing application. We believe that our paper makes a compelling case that HTM has substantial potential to make effective concurrent programming easier, and that we have made valuable contributions in guiding designers of future HTM features to exploit this potential.