The author examines whether efficient algorithms for software spin-waiting exist given hardware support for atomic instructions, or whether more complex kinds of hardware support are needed for good performance. He evaluates the performance of a number of software spin-waiting algorithms. Arbitration for control of a lock is in many ways similar to arbitration for control of a network connecting a distributed system, so he applies several of the static and dynamic arbitration methods originally developed for networks to spin locks. He also proposes a novel method for explicitly queueing spinning processors in software by assigning each a unique number when it arrives at the lock. Control of the lock can then be passed to the next processor in line with minimal effect on other processors.
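The queueing idea described in the abstract can be illustrated with a minimal sketch. This is not the author's code: the class name `ArrayQueueLock`, the `capacity` parameter, and the `run_demo` helper are hypothetical, and a small mutex stands in for the hardware atomic fetch-and-increment instruction, which Python does not expose. Each arriving thread takes a unique number, spins on its own flag slot, and on release hands the lock to the next slot in line.

```python
import threading
import time


class ArrayQueueLock:
    """Illustrative queue lock: each waiter spins on its own flag slot."""

    def __init__(self, capacity):
        # Assumption: capacity >= maximum number of threads contending at once.
        self.capacity = capacity
        self.has_lock = [False] * capacity
        self.has_lock[0] = True  # the first arrival may enter immediately
        self.next_ticket = 0
        # Stand-in for an atomic fetch-and-increment instruction (assumption).
        self._atomic = threading.Lock()

    def _fetch_and_increment(self):
        with self._atomic:
            ticket = self.next_ticket
            self.next_ticket += 1
            return ticket

    def acquire(self):
        # Take a unique number on arrival; it selects this thread's slot.
        slot = self._fetch_and_increment() % self.capacity
        while not self.has_lock[slot]:
            time.sleep(0)  # spin, yielding so other threads can run
        return slot

    def release(self, slot):
        # Clear our own flag, then pass control to the next slot in line.
        self.has_lock[slot] = False
        self.has_lock[(slot + 1) % self.capacity] = True


def run_demo(n_threads=4, iters=50):
    """Increment a shared counter under the lock and return the final count."""
    lock = ArrayQueueLock(n_threads)
    counter = [0]

    def worker():
        for _ in range(iters):
            slot = lock.acquire()
            counter[0] += 1  # critical section
            lock.release(slot)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

If mutual exclusion holds, `run_demo(n_threads, iters)` should return `n_threads * iters`. Because each waiter polls a different slot, a release changes only the flag that the next thread in line is watching, which is the property the abstract describes as minimal effect on other processors.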