Portable programs for parallel processors
Portable programs for parallel processors
The fuzzy barrier: a mechanism for high speed synchronization of processors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Apologizing versus asking permission: optimistic concurrency control for abstract data types
ACM Transactions on Database Systems (TODS)
Detecting violations of sequential consistency
SPAA '91 Proceedings of the third annual ACM symposium on Parallel algorithms and architectures
A methodology for implementing highly concurrent data objects
ACM Transactions on Programming Languages and Systems (TOPLAS)
Transactional memory: architectural support for lock-free data structures
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
The SPARC architecture manual (version 9)
The SPARC architecture manual (version 9)
Software caching and computation migration in Olden
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
An evaluation of memory consistency models for shared-memory systems with ILP processors
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Scheduler-conscious synchronization
ACM Transactions on Computer Systems (TOCS)
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
Data speculation support for a chip multiprocessor
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Clustered speculative multithreaded processors
ICS '99 Proceedings of the 13th international conference on Supercomputing
A Chip-Multiprocessor Architecture with Speculative Multithreading
IEEE Transactions on Computers
On optimistic methods for concurrency control
ACM Transactions on Database Systems (TODS)
The directory-based cache coherence protocol for the DASH multiprocessor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
ACM Transactions on Computer Systems (TOCS)
A scalable approach to thread-level speculation
Proceedings of the 27th annual international symposium on Computer architecture
First-class user-level threads
SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
Architectural support for scalable speculative parallelization in shared-memory multiprocessors
Proceedings of the 27th annual international symposium on Computer architecture
Speculative lock elision: enabling highly concurrent multithreaded execution
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Transactional lock-free execution of lock-based programs
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Multiple Reservations and the Oklahoma Update
IEEE Parallel & Distributed Technology: Systems & Technology
OpenMP: An Industry-Standard API for Shared-Memory Programming
IEEE Computational Science & Engineering
Parallel Programming with Polaris
Computer
The MIPS R10000 Superscalar Microprocessor
IEEE Micro
A Unified Formalization of Four Shared-Memory Models
IEEE Transactions on Parallel and Distributed Systems
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
An Direct-Execution Framework for Fast and Accurate Simulation of Superscalar Processors
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
A Mechanism for Speculative Memory Accesses Following Synchronizing Operations
IPDPS '00 Proceedings of the 14th International Symposium on Parallel and Distributed Processing
Transactional lock-free execution of lock-based programs
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Using thread-level speculation to simplify manual parallelization
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Toward efficient and robust software speculative parallelization on multiprocessors
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Transactional Memory Coherence and Consistency
Proceedings of the 31st annual international symposium on Computer architecture
Programming with transactional coherence and consistency (TCC)
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Coherence decoupling: making use of incoherence
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Nonblocking memory management support for dynamic-sized data structures
ACM Transactions on Computer Systems (TOCS)
Partially ordered epochs for thread-level speculation
Proceedings of the 2nd conference on Computing frontiers
Composable memory transactions
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Exposing speculative thread parallelism in SPEC2000
Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming
Design Space Exploration of a Software Speculative Parallelization Scheme
IEEE Transactions on Parallel and Distributed Systems
Toward a theory of transactional contention managers
Proceedings of the twenty-fourth annual ACM symposium on Principles of distributed computing
Energy reduction in multiprocessor systems using transactional memory
ISLPED '05 Proceedings of the 2005 international symposium on Low power electronics and design
Characterization of TCC on Chip-Multiprocessors
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
ReSlice: Selective Re-Execution of Long-Retired Misspeculated Instructions Using Forward Slicing
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Architectural Support for Software Transactional Memory
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
CAPSULE: Hardware-Assisted Parallel Execution of Component-Based Programs
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Executing Java programs with transactional memory
Science of Computer Programming - Special issue: Synchronization and concurrency in object-oriented languages
Starvation-free commit arbitration policies for transactional memory systems
ACM SIGARCH Computer Architecture News
A hardware/software framework for supporting transactional memory in a MPSoC environment
ACM SIGARCH Computer Architecture News
Making the fast case common and the uncommon case simple in unbounded transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
An integrated hardware-software approach to flexible transactional memory
Proceedings of the 34th annual international symposium on Computer architecture
Mechanisms for store-wait-free multiprocessors
Proceedings of the 34th annual international symposium on Computer architecture
Pipelined Execution of Critical Sections Using Software-Controlled Caching in Network Processors
Proceedings of the International Symposium on Code Generation and Optimization
Speculative optimization using hardware-monitored guarded regions for java virtual machines
Proceedings of the 3rd international conference on Virtual execution environments
Incrementally parallelizing database transactions with thread-level speculation
ACM Transactions on Computer Systems (TOCS)
Journal of Parallel and Distributed Computing
Rerun: Exploiting Episodes for Lightweight Memory Race Recording
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Optimistic concurrency for clusters via speculative locking
SYSTOR '09 Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference
InvisiFence: performance-transparent memory ordering in conventional multiprocessors
Proceedings of the 36th annual international symposium on Computer architecture
ECMon: exposing cache events for monitoring
Proceedings of the 36th annual international symposium on Computer architecture
RETCON: transactional repair without replay
Proceedings of the 37th annual international symposium on Computer architecture
Modeling critical sections in Amdahl's law and its implications for multicore design
Proceedings of the 37th annual international symposium on Computer architecture
Parallelization libraries: Characterizing and reducing overheads
ACM Transactions on Architecture and Code Optimization (TACO)
EURASIP Journal on Embedded Systems
Efficient partial roll-backing mechanism for transactional memory systems
Transactions on high-performance embedded architectures and compilers III
Software thread level speculation for the java language and virtual machine environment
LCPC'05 Proceedings of the 18th international conference on Languages and Compilers for Parallel Computing
Speculative optimizations for parallel programs on multicores
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Complementing user-level coarse-grain parallelism with implicit speculative parallelism
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Distributed transactional memory for metric-space networks
DISC'05 Proceedings of the 19th international conference on Distributed Computing
Software transactional memory validation – time and space considerations
Transactions on High-Performance Embedded Architectures and Compilers IV
Starvation-free transactional memory-system protocols
Euro-Par'07 Proceedings of the 13th international Euro-Par conference on Parallel Processing
Implicit transactional memory in kilo-instruction multiprocessors
ACSAC'07 Proceedings of the 12th Asia-Pacific conference on Advances in Computer Systems Architecture
Unifying thread-level speculation and transactional memory
Proceedings of the 13th International Middleware Conference
Automatic speculative parallelization of loops using polyhedral dependence analysis
Proceedings of the First International Workshop on Code OptimiSation for MultI and many Cores
WeeFence: toward making fences free in TSO
Proceedings of the 40th Annual International Symposium on Computer Architecture
Criticality stacks: identifying critical threads in parallel programs using synchronization behavior
Proceedings of the 40th Annual International Symposium on Computer Architecture
Hi-index | 0.00 |
Barriers, locks, and flags are synchronizing operations widely used programmers and parallelizing compilers to produce race-free parallel programs. Often times, these operations are placed suboptimally, either because of conservative assumptions about the program, or merely for code simplicity.We propose Speculative Synchronization, which applies the philosophy behind Thread-Level Speculation (TLS) to explicitly parallel applications. Speculative threads execute past active barriers, busy locks, and unset flags instead of waiting. The proposed hardware checks for conflicting accesses and, if a violation is detected, offending speculative thread is rolled back to the synchronization point and restarted on the fly. TLS's principle of always keeping a safe thread is key to our proposal: in any speculative barrier, lock, or flag, the existence of one or more safe threads at all times guarantees forward progress, even in the presence of access conflicts or speculative buffer overflow. Our proposal requires simple hardware and no programming effort. Furthermore, it can coexist with conventional synchronization at run time.We use simulations to evaluate 5 compiler- and hand-parallelized applications. Our results show a reduction in the time lost to synchronization of 34% on average, and a reduction in overall program execution time of 7.4% on average.