A fast, wire-efficient synchronization technique is developed that supports dynamic allocation of multiple threads on shared-memory, message-passing, and/or single-chip multiprocessors. The proposed distributed-sum bit-comparison (DSBC) method exploits an execution-sequence invariant: the instantaneous process production equals the instantaneous process consumption only upon barrier completion. For a system of n processing elements (PEs), a single instance of the global logic unit and n instances of the local logic unit, interconnected by 3n wires, are shown to provide direct support for an arbitrary number of barriers. The barrier detection time is shown to scale linearly with the number of active barriers in the system. Comparisons with Wired-NOR hardware and Shared-Lock software approaches indicate reduced barrier detection time, decreased inter-PE wiring requirements, and increased functionality. The suitability of adapting the DSBC method to a skew-insensitive clockless design is also discussed.
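
The production/consumption invariant stated in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the DSBC hardware: the names `LocalUnit`, `GlobalUnit`, and `demo` are hypothetical, the counters are plain integers rather than bit-serial signals on wires, and the model assumes that a parent process registers its children before it terminates, so that the two totals can only meet once all work has finished.

```python
from dataclasses import dataclass, field


@dataclass
class LocalUnit:
    """Hypothetical per-PE counters for a single barrier."""
    produced: int = 0  # processes created by this PE
    consumed: int = 0  # processes that terminated on this PE


@dataclass
class GlobalUnit:
    """Hypothetical global unit: sums the local counters and compares the totals."""
    local_units: list = field(default_factory=list)

    def barrier_complete(self) -> bool:
        total_produced = sum(u.produced for u in self.local_units)
        total_consumed = sum(u.consumed for u in self.local_units)
        # Assumption of this sketch: an untouched barrier (0 == 0)
        # is not reported as complete.
        return total_produced > 0 and total_produced == total_consumed


def demo() -> list:
    """Run a three-PE scenario and record the barrier state after each step."""
    pes = [LocalUnit() for _ in range(3)]
    glu = GlobalUnit(pes)
    trace = []

    pes[0].produced += 1  # root process is created on PE 0
    trace.append(glu.barrier_complete())

    pes[0].produced += 2  # root spawns two children before terminating
    pes[0].consumed += 1  # root terminates
    trace.append(glu.barrier_complete())

    pes[1].consumed += 1  # first child terminates on PE 1
    trace.append(glu.barrier_complete())

    pes[2].consumed += 1  # second child terminates on PE 2
    trace.append(glu.barrier_complete())
    return trace


if __name__ == "__main__":
    print(demo())
```

In this scenario the totals differ at every intermediate step and become equal only after the last child terminates, which is the point at which the model reports the barrier as complete.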