Parallel computing: theory and comparisons
Structure of Computers and Computations
The NYU Ultracomputer—designing a MIMD, shared-memory parallel machine (Extended Abstract)
ISCA '82 Proceedings of the 9th annual symposium on Computer Architecture
Some implementations of segment sequential functions
ISCA '76 Proceedings of the 3rd annual symposium on Computer architecture
A vector and array multiprocessor extension of the sylvan architecture
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
A performance bound analysis of multistage combining networks using a probabilistic model
ICS '91 Proceedings of the 5th international conference on Supercomputing
Fast barrier synchronization hardware
Proceedings of the 1990 ACM/IEEE conference on Supercomputing
An effective synchronization network for hot-spot accesses
ACM Transactions on Computer Systems (TOCS)
A Cost-Effective Combining Structure for Large-Scale Shared-Memory Multiprocessors
IEEE Transactions on Computers
Request Combining in Multiprocessors with Arbitrary Interconnection Networks
IEEE Transactions on Parallel and Distributed Systems
Restricted Fetch and Φ operations for parallel processing
ICS '89 Proceedings of the 3rd international conference on Supercomputing
An efficient fetch-and-op circuit is described. A bit-serial, circuit-switched implementation requires only 5 gates per node in a binary tree. The circuit is versatile: it can also implement test-and-set primitives (priority circuits) and swap operators, as well as the AND and OR operations used in SIMD tests such as “branch on all carries set.” It offers an alternative to the combining fetch-and-add circuit designed for the Ultracomputer project; the implementation is suited to SIMD computing and can be adapted to MIMD computing.
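
To make the fetch-and-add semantics concrete, the following is a minimal functional sketch in Python of what a binary combining tree computes. It is not the 5-gate bit-serial circuit described above, and it does not model circuit switching or timing. The function name `tree_fetch_and_add`, the heap-style node layout, and the left-to-right ordering of requests are all assumptions made for illustration. Each leaf submits an increment; an upward pass sums the increments at every node, and a downward pass hands each leaf the value it would have fetched had the requests been served one at a time in leaf order.

```python
def tree_fetch_and_add(base, increments):
    """Functional model of a combining fetch-and-add over a binary tree.

    Returns (fetched, new_value): fetched[i] is the value leaf i receives,
    new_value is the shared variable after all increments are applied.
    Illustrative only; the ordering of requests (leaf 0 first) is an assumption.
    """
    n = len(increments)
    if n == 0:
        return [], base

    # Pad the leaf count to a power of two; padded leaves add zero.
    size = 1
    while size < n:
        size *= 2

    # Heap layout: node k has children 2k and 2k+1; leaves are size..2*size-1.
    sums = [0] * (2 * size)
    for i, v in enumerate(increments):
        sums[size + i] = v

    # Upward pass: each node holds the sum of the increments in its subtree.
    for k in range(size - 1, 0, -1):
        sums[k] = sums[2 * k] + sums[2 * k + 1]

    # Downward pass: the left child inherits its parent's value; the right
    # child receives that value plus the left subtree's combined increment.
    fetched = [0] * (2 * size)
    fetched[1] = base
    for k in range(1, size):
        fetched[2 * k] = fetched[k]
        fetched[2 * k + 1] = fetched[k] + sums[2 * k]

    return fetched[size:size + n], base + sums[1]


if __name__ == "__main__":
    values, final = tree_fetch_and_add(10, [1, 2, 3, 4])
    print(values, final)  # [10, 11, 13, 16] 20
```

In this model, replacing the addition with another associative operator (for example, logical AND or OR) gives the corresponding fetch-and-op result by the same two passes.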