Amortized efficiency of list update and paging rules
Communications of the ACM
A fast mutual exclusion algorithm
ACM Transactions on Computer Systems (TOCS)
Firefly: A Multiprocessor Workstation
IEEE Transactions on Computers - Special issue on architectural support for programming languages and operating systems
The performance implications of thread management alternatives for shared-memory multiprocessors
SIGMETRICS '89 Proceedings of the 1989 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
A methodology for implementing highly concurrent data structures
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
The effect of context switches on cache performance
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Competitive randomized algorithms for non-uniform problems
SODA '90 Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms
Additional comments on a problem in concurrent programming control
Communications of the ACM
Solution of a problem in concurrent programming control
Communications of the ACM
The Effect of Scheduling Discipline on Spin Overhead in Shared Memory Parallel Systems
IEEE Transactions on Parallel and Distributed Systems
WAITING ALGORITHMS FOR SYNCHRONIZATION IN LARGE-SCALE MULTIPROCESSORS
WAITING ALGORITHMS FOR SYNCHRONIZATION IN LARGE-SCALE MULTIPROCESSORS
Network locality at the scale of processes
SIGCOMM '91 Proceedings of the conference on Communications architecture & protocols
SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
Scheduler activations: effective kernel support for the user-level management of parallelism
SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
Network locality at the scale of processes
ACM Transactions on Computer Systems (TOCS)
Operating system support for parallel programming on RP3
IBM Journal of Research and Development
Scheduler activations: effective kernel support for the user-level management of parallelism
ACM Transactions on Computer Systems (TOCS)
A dynamic processor allocation policy for multiprogrammed shared-memory multiprocessors
ACM Transactions on Computer Systems (TOCS)
Waiting algorithms for synchronization in large-scale multiprocessors
ACM Transactions on Computer Systems (TOCS)
Using scheduler information to achieve optimal barrier synchronization performance
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
PADS '93 Proceedings of the seventh workshop on Parallel and distributed simulation
Spin-block synchronization algorithm in the shared memory multiprocessor system
ACM SIGOPS Operating Systems Review
Reactive synchronization algorithms for multiprocessors
ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
A note on structured interrupts
ACM SIGOPS Operating Systems Review
High performance synchronization algorithms for multiprogrammed multiprocessors
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Reducing TLB and memory overhead using online superpage promotion
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Effective distributed scheduling of parallel workloads
Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Scheduler-conscious synchronization
ACM Transactions on Computer Systems (TOCS)
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Combining funnels: a new twist on an old tale…
PODC '98 Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing
Scheduling with implicit information in distributed systems
SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Scalable concurrent priority queue algorithms
Proceedings of the eighteenth annual ACM symposium on Principles of distributed computing
ICS '99 Proceedings of the 13th international conference on Supercomputing
A study of locking objects with bimodal fields
Proceedings of the 14th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Adaptive two-level thread management for fast MPI execution on shared memory machines
SC '99 Proceedings of the 1999 ACM/IEEE conference on Supercomputing
Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
Cache decay: exploiting generational behavior to reduce cache leakage power
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Let caches decay: reducing leakage energy via exploitation of cache generational behavior
ACM Transactions on Computer Systems (TOCS)
Non-blocking timeout in scalable queue-based spin locks
Proceedings of the twenty-first annual symposium on Principles of distributed computing
WOSP '02 Proceedings of the 3rd international workshop on Software and performance
International Journal of Parallel Programming
Reducing Waiting Costs in User-Level Communication
IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
Barrier Synchronization on a Loaded SMP Using Two-Phase Waiting Algorithms
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Adaptive Disk Spin-down Policies for Mobile Computers
MLICS '95 Proceedings of the 2nd Symposium on Mobile and Location-Independent Computing
Improving server software support for simultaneous multithreaded processors
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Thread prioritization: a thread scheduling mechanism for multiple-context parallel processors
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Two Adaptive Hybrid Cache Coherency Protocols
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Pipelined functional tree accesses and updates: scheduling, synchronization, caching and coherence
Journal of Functional Programming
Java server performance: a case study of building efficient, scalable Jvms
IBM Systems Journal
A softerware monitor for shared-memory multiprocessor computers
Software—Practice & Experience
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Spin Detection Hardware for Improved Management of Multithreaded Systems
IEEE Transactions on Parallel and Distributed Systems
Lightweight lock-free synchronization methods for multithreading
Proceedings of the 20th annual international conference on Supercomputing
Self-tuning reactive diffracting trees
Journal of Parallel and Distributed Computing
Efficient self-tuning spin-locks using competitive analysis
Journal of Systems and Software
Towards scalable multiprocessor virtual machines
VM'04 Proceedings of the 3rd conference on Virtual Machine Research And Technology Symposium - Volume 3
Using continuations to build a user-level threads library
MSYM'93 Proceedings of the 3rd conference on USENIX MACH III Symposium - Volume 1
Adaptive modem connection lifetimes
ATEC '99 Proceedings of the annual conference on USENIX Annual Technical Conference
Communications of the ACM - Security in the Browser
Fast switching of threads between cores
ACM SIGOPS Operating Systems Review
The multikernel: a new OS architecture for scalable multicore systems
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
Smartlocks: lock acquisition scheduling for self-aware synchronization
Proceedings of the 7th international conference on Autonomic computing
An analysis of Linux scalability to many cores
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Preemption adaptivity in time-published queue-based spin locks
HiPC'05 Proceedings of the 12th international conference on High Performance Computing
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
ACM SIGOPS 24th Symposium on Operating Systems Principles
VirtuOS: an operating system with kernel virtualization
Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles
Hi-index | 0.00 |
A common operation in multiprocessor programs is acquiring a lock to protect access to shared data. Typically, the requesting thread is blocked if the lock it needs is held by another thread. The cost of blocking one thread and activating another can be a substantial part of program execution time. Alternatively, the thread could spin until the lock is free, or spin for a while and then block. This may avoid context-switch overhead, but processor cycles may be wasted in unproductive spinning. This paper studies seven strategies for determining whether and how long to spin before blocking. Of particular interest are competitive strategies, for which the performance can be shown to be no worse than some constant factor times an optimal off-line strategy. The performance of five competitive strategies is compared with that of always blocking, always spinning, or using the optimal off-line algorithm. Measurements of lock-waiting time distributions for five parallel programs were used to compare the cost of synchronization under all the strategies. Additional measurements of elapsed time for some of the programs and strategies allowed assessment of the impact of synchronization strategy on overall program performance. Both types of measurements indicate that the standard blocking strategy performs poorly compared to mixed strategies. Among the mixed strategies studied, adaptive algorithms perform better than non-adaptive ones.