On optimistic methods for concurrency control
ACM Transactions on Database Systems (TODS)
A scalable approach to thread-level speculation
Proceedings of the 27th annual international symposium on Computer architecture
Time, clocks, and the ordering of events in a distributed system
Communications of the ACM
Architectural support for scalable speculative parallelization in shared-memory multiprocessors
Proceedings of the 27th annual international symposium on Computer architecture
Removing architectural bottlenecks to the scalability of speculative parallelization
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Estimation of the likelihood of capacitive coupling noise
Proceedings of the 39th annual Design Automation Conference
Transactional lock-free execution of lock-based programs
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
Token coherence: decoupling performance and correctness
Proceedings of the 30th annual international symposium on Computer architecture
The Soft Error Problem: An Architectural Perspective
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Unbounded Transactional Memory
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Improving Multiple-CMP Systems Using Token Coherence
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Virtualizing Transactional Memory
Proceedings of the 32nd annual international symposium on Computer Architecture
Advanced contention management for dynamic software transactional memory
Proceedings of the twenty-fourth annual ACM symposium on Principles of distributed computing
Thread-Level Speculation on a CMP can be energy efficient
Proceedings of the 19th annual international conference on Supercomputing
Design and analysis of an NoC architecture from performance, reliability and energy perspective
Proceedings of the 2005 ACM symposium on Architecture for networking and communications systems
Towards on-chip fault-tolerant communication
ASP-DAC '03 Proceedings of the 2003 Asia and South Pacific Design Automation Conference
Exploring Fault-Tolerant Network-on-Chip Architectures
DSN '06 Proceedings of the International Conference on Dependable Systems and Networks
Bulk Disambiguation of Speculative Threads in Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Coherence Ordering for Ring-based Chip Multiprocessors
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Proximity-aware directory-based coherence for multi-core processor architectures
Proceedings of the nineteenth annual ACM symposium on Parallel algorithms and architectures
An effective hybrid transactional memory system with strong isolation guarantees
Proceedings of the 34th annual international symposium on Computer architecture
Performance pathologies in hardware transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
BulkSC: bulk enforcement of sequential consistency
Proceedings of the 34th annual international symposium on Computer architecture
An Effective Starvation Avoidance Mechanism to Enhance the Token Coherence Protocol
PDP '07 Proceedings of the 15th Euromicro International Conference on Parallel, Distributed and Network-Based Processing
A Scalable, Non-blocking Approach to Transactional Memory
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Error Detection via Online Checking of Cache Coherence with Token Coherence Signatures
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
A Low Overhead Fault Tolerant Coherence Protocol for CMP Architectures
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Verifying safety of a token coherence implementation by parametric compositional refinement
VMCAI'05 Proceedings of the 6th international conference on Verification, Model Checking, and Abstract Interpretation
Modeling and analysis of crosstalk noise in coupled RLC interconnects
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
ScalableBulk: Scalable Cache Coherence for Atomic Blocks in a Lazy Environment
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
A Dynamically Adaptable Hardware Transactional Memory
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
ZEBRA: a data-centric, hybrid-policy hardware transactional memory design
Proceedings of the international conference on Supercomputing
Proceedings of the 38th annual international symposium on Computer architecture
BlockChop: dynamic squash elimination for hybrid processor architecture
Proceedings of the 39th Annual International Symposium on Computer Architecture
Transactional prefetching: narrowing the window of contention in hardware transactional memory
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
ARCS'13 Proceedings of the 26th international conference on Architecture of Computing Systems
BulkCommit: scalable and fast commit of atomic blocks in a lazy multiprocessor environment
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
SI-TM: reducing transactional memory abort rates through snapshot isolation
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
In a hardware transactional memory system with lazy versioning and lazy conflict detection, the process of transaction commit can emerge as a bottleneck. This is especially true for a large-scale distributed memory system where multiple transactions may attempt to commit simultaneously and coordination is required before allowing commits to proceed in parallel. In this paper, we propose novel algorithms to implement commit that are more scalable in terms of delay and are free of deadlocks/livelocks. We show that these algorithms have similarities with the token cache coherence concept and leverage these similarities to extend the algorithms to handle message loss and starvation scenarios. The proposed algorithms improve upon the state-of-the-art by yielding up to a 7X reduction in commit delay and up to a 48X reduction in network messages for commit. These translate into overall performance improvements of up to 66% (for synthetic workloads with average transaction length of 200 cycles), 35% (for average transaction length of 1000 cycles), and 8% (for average transaction length of 4000 cycles). For a small group of multi-threaded programs with frequent transaction commits, improvements of up to 8% were observed for a 32-node simulation.