Modern GPUs have shown promising results in accelerating computation-intensive and numerical workloads with limited dynamic data sharing. However, many real-world applications exhibit a substantial amount of data sharing among concurrently executing threads. Such sharing often requires a mutual exclusion mechanism to ensure data integrity in a multithreaded environment. Although modern GPUs provide atomic primitives that can be used to construct fine-grained locks, lock-based synchronization requires significant programming effort to achieve functional correctness. The massive multithreading and SIMT execution paradigm of GPUs further compound the challenges of GPU locks. To let applications with dynamic data sharing benefit from GPU acceleration, we propose a novel software transactional memory system for GPU architectures (GPU-STM). The major challenges are ensuring good scalability under the massive multithreading of GPUs and preventing livelocks caused by the SIMT execution paradigm. To address these two challenges, we propose (1) a hierarchical validation technique and (2) an encounter-time lock-sorting mechanism, respectively. We build our GPU-STM prototype on a commercially available GPU platform and runtime. Our evaluation on a real system shows that GPU-STM outperforms coarse-grained locks on GPUs by up to 20x.
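The abstract names an encounter-time lock-sorting mechanism without describing it. The host-side C++ sketch below illustrates only the general idea such a scheme presumably builds on: if every thread records the locks it needs in sorted order as accesses are encountered, and later acquires them in that same global order, threads cannot form a circular wait. This is not the GPU-STM implementation. The names `g_locks`, `kNumLocks`, `LockLog`, `try_acquire_all`, and `release_all`, as well as the lock-table size and the modulo hashing, are assumptions made for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical global lock table (size chosen arbitrarily): 0 = free, 1 = held.
constexpr std::size_t kNumLocks = 1024;
static std::atomic<int> g_locks[kNumLocks];

// Per-transaction log of lock indices, kept sorted and duplicate-free.
struct LockLog {
    std::vector<std::size_t> ids;

    // Called when a transactional access is encountered: insert the lock
    // index at its sorted position so no separate sort is needed later.
    void record(std::size_t addr_hash) {
        std::size_t id = addr_hash % kNumLocks;  // illustrative hashing
        auto pos = std::lower_bound(ids.begin(), ids.end(), id);
        if (pos == ids.end() || *pos != id) ids.insert(pos, id);
    }
};

// Try to take every logged lock in ascending order. On the first failure,
// release the locks already taken and return false so the caller can retry.
bool try_acquire_all(const LockLog& log) {
    for (std::size_t i = 0; i < log.ids.size(); ++i) {
        int expected = 0;
        if (!g_locks[log.ids[i]].compare_exchange_strong(expected, 1)) {
            for (std::size_t j = 0; j < i; ++j) g_locks[log.ids[j]].store(0);
            return false;
        }
    }
    return true;
}

// Release every lock recorded in the log.
void release_all(const LockLog& log) {
    for (std::size_t id : log.ids) g_locks[id].store(0);
}
```

A transaction in this sketch would call `record` on each access, then `try_acquire_all` before committing and `release_all` afterwards. On a GPU the same ordering idea would have to be expressed with device atomics such as `atomicCAS`, and the details of how GPU-STM does this are not given in the abstract.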