Shared counters are among the most basic coordination structures in multiprocessor computation, with applications ranging from barrier synchronization to dynamic load balancing. This paper introduces diffracting trees, novel distributed-parallel data structures for shared counting. Diffracting trees combine a randomized coordination method with a combinatorial data structure to yield a logarithmic-depth counter that improves on the log² depth of counting networks and overcomes the resiliency drawbacks of combining trees. Empirical evidence collected on a simulated distributed shared-memory multiprocessor shows that diffracting trees substantially outperform both combining trees and counting networks, currently the most effective known methods for shared counting. Not only do diffracting trees have higher throughput and lower latency, but, unlike any other known technique, their latency remains almost constant as the number of processors increases.
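
As a rough illustration of the combinatorial half of this design, the sketch below models a binary tree of toggle-bit balancers with a local counter at each leaf, so that a tree of depth d hands out values through 2^d leaves. It is a sequential toy under stated assumptions, not the paper's implementation: the randomized diffraction step (where pairs of colliding requests bypass the toggle) and all concurrency control are omitted, and the names `Balancer`, `CountingTree`, and `next_value` are hypothetical.

```python
class Balancer:
    """A toggle bit: successive tokens alternate between output 0 and output 1."""

    def __init__(self):
        self.toggle = 0

    def traverse(self):
        out = self.toggle
        self.toggle ^= 1  # flip so the next token takes the other output
        return out


class CountingTree:
    """Binary tree of balancers with one counter per leaf (hypothetical sketch)."""

    def __init__(self, depth):
        self.depth = depth
        self.width = 1 << depth          # number of leaves
        self.balancers = {}              # path prefix (tuple of bits) -> Balancer
        self.leaf_counts = [0] * self.width

    def next_value(self):
        leaf = 0
        path = ()
        for level in range(self.depth):
            balancer = self.balancers.setdefault(path, Balancer())
            bit = balancer.traverse()
            leaf |= bit << level         # root decides the least significant bit
            path += (bit,)
        # Leaf i hands out i, i + width, i + 2 * width, ...
        value = leaf + self.width * self.leaf_counts[leaf]
        self.leaf_counts[leaf] += 1
        return value


tree = CountingTree(depth=3)
values = [tree.next_value() for _ in range(20)]
print(values)
```

Each request touches only `depth` balancers on its way to a leaf, which is the sense in which the counter has logarithmic depth in the number of leaves. In a real concurrent setting, the toggles near the root would become hot spots, which is the contention that the randomized coordination method is meant to relieve.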