Combinable memory-block transactions

Authors:
Guy E. Blelloch;Phillip B. Gibbons;S. Harsha Vardhan
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA;Intel Research Pittsburgh, Pittsburgh, PA, USA;Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures
Year:
2008

Abstract

This paper formalizes and studies combinable memory-block transactions (MBTs). The idea is to encode short programs that operate on a single cache/memory block and then to specify such a program with a memory request. The code is then executed at the cache or memory controller, atomically with respect to other accesses to that block by this or other processors. The combinable form allows requests to be combined within the memory system or network. In addition to supporting the standard set of read-modify-write operations (e.g., test-and-set, compare-and-swap, fetch-and-add), MBTs can be used to define other useful operations, such as a fetch-and-add that does not decrement below zero. We show how MBTs can be used to design simple and efficient implementations of a variety of protocols and algorithms, including a priority write, a semaphore with a non-blocking P operation, a bounded queue, and a timestamp-based transactional memory system. In all cases the protocols gain some advantage by using MBTs that differ from the standard set of operations. To understand the efficiency that can be gained by combining, we define a notion of bounded contention and show that all our protocols have bounded contention under arbitrary loads.
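
The idea of shipping a short program to a single memory block can be illustrated with a small software simulation. The sketch below is not the paper's formalism or hardware design: the names `MemoryBlock`, `execute`, `bounded_fetch_and_add`, `priority_write`, and `try_p` are hypothetical, a lock stands in for the atomicity a cache or memory controller would provide, and combining in the network is not modeled. The exact semantics chosen here (skipping an update that would go negative, and a priority write that keeps the smaller value) are assumptions made for illustration.

```python
import threading


class MemoryBlock:
    """Toy stand-in for one cache/memory block and its controller (assumed model)."""

    def __init__(self, words):
        self._words = list(words)
        # The lock models per-block atomicity; real MBTs would run at the controller.
        self._lock = threading.Lock()

    def execute(self, mbt, *args):
        """Run the short program `mbt` on this block atomically and return its result."""
        with self._lock:
            return mbt(self._words, *args)


def bounded_fetch_and_add(words, index, delta):
    """Fetch-and-add that never takes the word below zero.

    Returns the old value. Assumption: an update that would make the
    word negative is skipped entirely rather than clamped.
    """
    old = words[index]
    if old + delta >= 0:
        words[index] = old + delta
    return old


def priority_write(words, index, value):
    """Store `value` only if it is smaller than the current word; return the old word.

    Assumption: smaller values have higher priority.
    """
    old = words[index]
    if value < old:
        words[index] = value
    return old


def try_p(block, index=0):
    """Non-blocking semaphore P built on the bounded fetch-and-add.

    Returns True if a unit was acquired, False if the count was already zero.
    """
    return block.execute(bounded_fetch_and_add, index, -1) > 0
```

With a semaphore initialized as `MemoryBlock([2])`, three successive calls to `try_p` acquire twice and then fail without blocking, and the count stays at zero. A single request carries the whole test-and-update, so no retry loop around compare-and-swap is needed in this sketch.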