The Wisconsin multicube: a new large-scale cache-coherent multiprocessor
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Reducing false sharing on shared memory multiprocessors through compile time data transformations
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
A dynamic multithreading processor
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
An architecture for mostly functional languages
LFP '86 Proceedings of the 1986 ACM conference on LISP and functional programming
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Speculative lock elision: enabling highly concurrent multithreaded execution
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Transactional lock-free execution of lock-based programs
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
False Sharing and Spatial Locality in Multiprocessor Caches
IEEE Transactions on Computers
A dynamic cache sub-block design to reduce false sharing
ICCD '95 Proceedings of the 1995 International Conference on Computer Design: VLSI in Computers and Processors
A Practical Multi-word Compare-and-Swap Operation
DISC '02 Proceedings of the 16th International Conference on Distributed Computing
Variability in Architectural Simulations of Multi-Threaded Workloads
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Software transactional memory for dynamic-sized data structures
Proceedings of the twenty-second annual symposium on Principles of distributed computing
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Improving Value Communication for Thread-Level Speculation
HPCA '02 Proceedings of the 8th International Symposium on High-Performance Computer Architecture
Transactional Memory Coherence and Consistency
Proceedings of the 31st annual international symposium on Computer architecture
Coherence decoupling: making use of incoherence
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Unbounded Transactional Memory
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Virtualizing Transactional Memory
Proceedings of the 32nd annual international symposium on Computer Architecture
Characterization of TCC on Chip-Multiprocessors
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Compiler and runtime support for efficient software transactional memory
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Bulk Disambiguation of Speculative Threads in Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Unbounded page-based transactional memory
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Computer Architecture, Fourth Edition: A Quantitative Approach
Computer Architecture, Fourth Edition: A Quantitative Approach
Performance pathologies in hardware transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
MetaTM/TxLinux: transactional memory for an operating system
Proceedings of the 34th annual international symposium on Computer architecture
The transactional memory / garbage collection analogy
Proceedings of the 22nd annual ACM SIGPLAN conference on Object-oriented programming systems and applications
JudoSTM: A Dynamic Binary-Rewriting Approach to Software Transactional Memory
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Kicking the tires of software transactional memory: why the going gets tough
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
TokenTM: Efficient Execution of Large Transactions with Hardware Transactional Memory
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Flexible Decoupled Transactional Memory Support
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Early experience with a commercial hardware transactional memory implementation
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Extending concurrency of transactional memory programs by using value prediction
Proceedings of the 6th ACM conference on Computing frontiers
EazyHTM: eager-lazy hardware transactional memory
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
The Art of Multiprocessor Programming
The Art of Multiprocessor Programming
RETCON: transactional repair without replay
Proceedings of the 37th annual international symposium on Computer architecture
Transactional Memory, 2nd Edition
Transactional Memory, 2nd Edition
Hardware transactional memory for GPU architectures
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Hi-index | 0.00 |
This paper explores data speculation for improving the performance of Hardware Transactional Memory (HTM). We attempt to reduce transactional conflicts by decoupling them from cache coherence conflicts; many HTMs do not distinguish between transactional conflicts and coherence conflicts, leading to false transactional conflicts. We also attempt to mitigate the effects of coherence conflicts by using value prediction in transactions. We show that coherence decoupling and value prediction in transactions complement each other, because they both speculate on data in ways that are infeasible in the absence of HTM support. As a demonstration of how data speculation can improve performance, we introduce DPTM, a best-effort HTM that mitigates the effects of false sharing at the cache line level. DPTM does not alter the underlying cache coherence protocol, and requires only minor, processor-local, modifications. We evaluate DPTM against a baseline best-effort HTM, and compare it with data restructuring by padding, the most commonly used method to avoid false sharing. Our experiments show that DPTM can dramatically improve performance in the presence of false sharing without degrading performance in its absence, and consistently performs better than restructuring by padding.