ACM Transactions on Database Systems (TODS)
SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
Performance issues in non-blocking synchronization on shared-memory multiprocessors
PODC '92 Proceedings of the eleventh annual ACM symposium on Principles of distributed computing
A methodology for implementing highly concurrent data objects
ACM Transactions on Programming Languages and Systems (TOPLAS)
Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Shoring up persistent applications
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
ARB: A Hardware Mechanism for Dynamic Reordering of Memory References
IEEE Transactions on Computers
Performance analysis using very large memory on the 64-bit AlphaServer system
Digital Technical Journal
Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Efficient synchronization: let them eat QOLB
Proceedings of the 24th annual international symposium on Computer architecture
The SGI Origin: a ccNUMA highly scalable server
Proceedings of the 24th annual international symposium on Computer architecture
Concurrency control: methods, performance, and analysis
ACM Computing Surveys (CSUR)
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
On optimistic methods for concurrency control
ACM Transactions on Database Systems (TODS)
Implementation of precise interrupts in pipelined processors
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
A study of common pitfalls in simple multi-threaded programs
Proceedings of the thirty-first SIGCSE technical symposium on Computer science education
On the value locality of store instructions
Proceedings of the 27th annual international symposium on Computer architecture
Piranha: a scalable architecture based on single-chip multiprocessing
Proceedings of the 27th annual international symposium on Computer architecture
Concurrent reading and writing
Communications of the ACM
Multiple Reservations and the Oklahoma Update
IEEE Parallel & Distributed Technology: Systems & Technology
Dynamic decentralized cache schemes for mimd parallel processors
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Handling of packet dependencies: a critical issue for highly parallel network processors
CASES '02 Proceedings of the 2002 international conference on Compilers, architecture, and synthesis for embedded systems
Lock reservation: Java locks can mostly do without atomic operations
OOPSLA '02 Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Transactional lock-free execution of lock-based programs
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Speculative synchronization: applying thread-level speculation to explicitly parallel applications
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Enhancing software reliability with speculative threads
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Toward efficient and robust software speculative parallelization on multiprocessors
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Improving server software support for simultaneous multithreaded processors
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Using Interaction Costs for Microarchitectural Bottleneck Analysis
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Transactional Memory Coherence and Consistency
Proceedings of the 31st annual international symposium on Computer architecture
Interaction cost and shotgun profiling
ACM Transactions on Architecture and Code Optimization (TACO)
Coherence decoupling: making use of incoherence
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Speculative Incoherent Cache Protocols
IEEE Micro
Partially ordered epochs for thread-level speculation
Proceedings of the 2nd conference on Computing frontiers
Revocable locks for non-blocking programming
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Virtualizing Transactional Memory
Proceedings of the 32nd annual international symposium on Computer Architecture
Design Space Exploration of a Software Speculative Parallelization Scheme
IEEE Transactions on Parallel and Distributed Systems
Characterization of TCC on Chip-Multiprocessors
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Store Memory-Level Parallelism Optimizations for Commercial Applications
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
SableSpMT: a software framework for analysing speculative multithreading in Java
PASTE '05 Proceedings of the 6th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering
Optimizing memory transactions
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Compiler and runtime support for efficient software transactional memory
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Hardware tansactional memory support for lightweight dynamic language evolution
Companion to the 21st ACM SIGPLAN symposium on Object-oriented programming systems, languages, and applications
Executing Java programs with transactional memory
Science of Computer Programming - Special issue: Synchronization and concurrency in object-oriented languages
Nested transactional memory: model and architecture sketches
Science of Computer Programming - Special issue: Synchronization and concurrency in object-oriented languages
Starvation-free commit arbitration policies for transactional memory systems
ACM SIGARCH Computer Architecture News
Making the fast case common and the uncommon case simple in unbounded transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
MetaTM/TxLinux: transactional memory for an operating system
Proceedings of the 34th annual international symposium on Computer architecture
An integrated hardware-software approach to flexible transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
Hardware atomicity for reliable software speculation
Proceedings of the 34th annual international symposium on Computer architecture
Mechanisms for store-wait-free multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Pipelined Execution of Critical Sections Using Software-Controlled Caching in Network Processors
Proceedings of the International Symposium on Code Generation and Optimization
Store Atomicity for Transactional Memory
Electronic Notes in Theoretical Computer Science (ENTCS)
Contention resolution with heterogeneous job sizes
ESA'06 Proceedings of the 14th conference on Annual European Symposium - Volume 14
The mechanics of in-kernel synchronization for a scalable microkernel
ACM SIGOPS Operating Systems Review
TxLinux: using and managing hardware transactional memory in an operating system
Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
Hiding the misprediction penalty of a resource-efficient high-performance processor
ACM Transactions on Architecture and Code Optimization (TACO)
Incrementally parallelizing database transactions with thread-level speculation
ACM Transactions on Computer Systems (TOCS)
Journal of Parallel and Distributed Computing
General and efficient locking without blocking
Proceedings of the 2008 ACM SIGPLAN workshop on Memory systems performance and correctness: held in conjunction with the Thirteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '08)
Proceedings of the 2008 ACM SIGPLAN workshop on Memory systems performance and correctness: held in conjunction with the Thirteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '08)
Is the optimism in optimistic concurrency warranted?
HOTOS'07 Proceedings of the 11th USENIX workshop on Hot topics in operating systems
A case for low-complexity MP architectures
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Proceedings of the 5th conference on Computing frontiers
Adaptive transaction scheduling for transactional memory systems
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
TxLinux and MetaTM: transactional memory and the operating system
Communications of the ACM - Enterprise information integration: and other tools for merging data
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Rerun: Exploiting Episodes for Lightweight Memory Race Recording
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Contention-aware scheduler: unlocking execution parallelism in multithreaded java programs
Proceedings of the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications
Critical sections: re-emerging scalability concerns for database storage engines
Proceedings of the 4th international workshop on Data management on new hardware
Maximum benefit from a minimal HTM
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Early experience with a commercial hardware transactional memory implementation
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Accelerating critical section execution with asymmetric multi-core architectures
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
DracoSTM: a practical C++ approach to software transactional memory
LCSD '07 Proceedings of the 2007 Symposium on Library-Centric Software Design
A runtime system for software lock elision
Proceedings of the 4th ACM European conference on Computer systems
Optimistic concurrency for clusters via speculative locking
SYSTOR '09 Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference
InvisiFence: performance-transparent memory ordering in conventional multiprocessors
Proceedings of the 36th annual international symposium on Computer architecture
ECMon: exposing cache events for monitoring
Proceedings of the 36th annual international symposium on Computer architecture
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Early experience with a commercial hardware transactional memory implementation
Early experience with a commercial hardware transactional memory implementation
A real system evaluation of hardware atomicity for software speculation
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Evaluation of AMD's advanced synchronization facility within a complete transactional memory stack
Proceedings of the 5th European conference on Computer systems
Lock elision for read-only critical sections in Java
PLDI '10 Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation
Simplifying concurrent algorithms by exploiting hardware transactional memory
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Contention-sensitive data structures and algorithms
DISC'09 Proceedings of the 23rd international conference on Distributed computing
RETCON: transactional repair without replay
Proceedings of the 37th annual international symposium on Computer architecture
Modeling critical sections in Amdahl's law and its implications for multicore design
Proceedings of the 37th annual international symposium on Computer architecture
Journal of Parallel and Distributed Computing
Adaptive locks: Combining transactions and locks for efficient concurrency
Journal of Parallel and Distributed Computing
Transactional memory should be an implementation technique, not a programming interface
HotPar'09 Proceedings of the First USENIX conference on Hot topics in parallelism
On the effectiveness of speculative and selective memory fences
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Architectural Support for Fair Reader-Writer Locking
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Circuit design of a dual-versioning L1 data cache for optimistic concurrency
Proceedings of the 21st edition of the great lakes symposium on Great lakes symposium on VLSI
Adding concurrency in python using a commercial processor's hardware transactional memory support
ACM SIGARCH Computer Architecture News
Data-race exceptions have benefits beyond the memory model
Proceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Optimizing hybrid transactional memory: the importance of nonspeculative operations
Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
Transactional conflict decoupling and value prediction
Proceedings of the international conference on Supercomputing
Proceedings of the 38th annual international symposium on Computer architecture
ICDCIT'10 Proceedings of the 6th international conference on Distributed Computing and Internet Technology
Bottleneck identification and scheduling in multithreaded applications
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Speculative optimizations for parallel programs on multicores
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Complementing user-level coarse-grain parallelism with implicit speculative parallelism
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Circuit design of a dual-versioning L1 data cache
Integration, the VLSI Journal
Static analysis and compiler design for idempotent processing
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Hardware support for enforcing isolation in lock-based parallel programs
Proceedings of the 26th ACM international conference on Supercomputing
A case for including transactions in OpenMP II: hardware transactional memory
IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World
Automated concurrency-bug fixing
OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation
Starvation-free transactional memory-system protocols
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Pessimistic software lock-elision
DISC'12 Proceedings of the 26th international conference on Distributed Computing
Using hardware transactional memory to correct and simplify and readers-writer lock algorithm
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Programming with hardware lock elision
Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming
Transactional Memory Architecture and Implementation for IBM System Z
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 27th international ACM conference on International conference on supercomputing
Exploring memory consistency for massively-threaded throughput-oriented processors
Proceedings of the 40th Annual International Symposium on Computer Architecture
WeeFence: toward making fences free in TSO
Proceedings of the 40th Annual International Symposium on Computer Architecture
Robust architectural support for transactional memory in the power architecture
Proceedings of the 40th Annual International Symposium on Computer Architecture
Criticality stacks: identifying critical threads in parallel programs using synchronization behavior
Proceedings of the 40th Annual International Symposium on Computer Architecture
Transparent and energy-efficient speculation on NUMA architectures for embedded MPSoCs
Proceedings of the First International Workshop on Many-core Embedded Systems
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
SI-TM: reducing transactional memory abort rates through snapshot isolation
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Eliminating global interpreter locks in ruby through hardware transactional memory
Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming
Scaling existing lock-based applications with lock elision
Communications of the ACM
Scaling Existing Lock-based Applications with Lock Elision
Queue - Performance
Hi-index | 0.02 |
Serialization of threads due to critical sections is a fundamental bottleneck to achieving high performance in multithreaded programs. Dynamically, such serialization may be unnecessary because these critical sections could have safely executed concurrently without locks. Current processors cannot fully exploit such parallelism because they do not have mechanisms to dynamically detect such false inter-thread dependences.We propose Speculative Lock Elision (SLE), a novel micro-architectural technique to remove dynamically unnecessary lock-induced serialization and enable highly concurrent multithreaded execution. The key insight is that locks do not always have to be acquired for a correct execution. Synchronization instructions are predicted as being unnecessary and elided. This allows multiple threads to concurrently execute critical sections protected by the same lock. Misspeculation due to inter-thread data conflicts is detected using existing cache mechanisms and rollback is used for recovery. Successful speculative elision is validated and committed without acquiring the lock.SLE can be implemented entirely in microarchitecture without instruction set support and without system-level modifications, is transparent to programmers, and requires only trivial additional hardware support. SLE can provide programmers a fast path to writing correct high-performance multithreaded programs.