The emergence of chip multiprocessors has created a need to exploit thread-level parallelism (TLP) beyond the DOALL type. This calls for efficient thread synchronization techniques that can exploit TLP in general parallel programs with dependences. Several thread synchronization techniques have been proposed in the past. However, their large run-time overhead limits the exploitation of fine-grain TLP. Furthermore, because these techniques are oblivious to the underlying memory model, they can potentially result in (i) deadlocks between threads and (ii) non-deterministic run-time execution behavior. In this paper, we propose lightweight lock-free thread synchronization methods to exploit TLP in general parallel programs with dependences. Each synchronization method intrinsically guarantees the following in a multithreaded program: (a) sequential consistency, (b) atomicity of writes to the shared synchronization construct and (c) absence of deadlocks. This reduces the programming effort considerably, thereby easing the development of software for multithreaded systems. For each method we formally prove that no deadlock can occur between threads. This relieves the programmer of the cumbersome and time-consuming process of detecting and eliminating deadlocks. Experiments show that our synchronization methods incur a low overhead of 7.16% on average. Further, we achieve speedups of up to 3.39x on kernels extracted from the industry-standard SPEC OMPM 2001 benchmarks, on a dedicated Intel® Xeon® 2.78 GHz 4-way multiprocessor.
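To make the idea of lock-free synchronization for a loop with dependences concrete, the following is a minimal illustrative sketch, not the methods proposed in the paper. It assumes a loop with a single loop-carried dependence (`a[i]` depends on `a[i-1]`), round-robin assignment of iterations to threads, and one shared counter accessed with sequentially consistent atomic operations. The names `doacross_chain` and `done` are hypothetical and chosen only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical example: computes a[i] = a[i-1] + 1 for i in [1, n), with
// iterations distributed round-robin over num_threads threads.
std::vector<long> doacross_chain(std::size_t n, unsigned num_threads) {
    assert(num_threads >= 1);
    std::vector<long> a(n, 0);

    // Shared synchronization variable: index of the latest iteration whose
    // result has been published. Iteration 0 is the initial value of a[0].
    std::atomic<std::size_t> done{0};

    auto worker = [&](unsigned tid) {
        for (std::size_t i = 1 + tid; i < n; i += num_threads) {
            // Wait (without taking a lock) until iteration i-1 is published.
            while (done.load(std::memory_order_seq_cst) < i - 1) {
                std::this_thread::yield();
            }
            a[i] = a[i - 1] + 1;
            // Publish iteration i with a single atomic write.
            done.store(i, std::memory_order_seq_cst);
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        threads.emplace_back(worker, t);
    }
    for (auto& th : threads) {
        th.join();
    }
    return a;
}
```

In this sketch each thread handles its iterations in increasing order and waits only on the immediately preceding iteration, so the lowest unfinished iteration can always proceed and no cyclic wait arises. Sequentially consistent ordering is the default for `std::atomic` operations; it is spelled out here only to mirror the guarantee the abstract names.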