Several techniques have been proposed to allow parallel access to a shared memory location by combining requests. They have one or more of the following attributes: requirements for a priori knowledge of the requests to combine, restrictions on the routing of messages in the network, or the use of sophisticated interconnection network nodes. We present a new method of combining requests that does not have the above requirements. We obtain this new method by developing a classification scheme for the existing methods of request combining. This classification scheme is facilitated by separating the request combining process into a two-part operation: determining the combining set, which is the set of requests that participate in a combined access; and distributing the results of the combined access to the members of the combining set. The classification of combining strategies is based upon which system component, the processor elements or the interconnection network, performs each of these tasks. Our approach, which uses the interconnection network to establish the combining set and the processor elements to distribute the results, lies in an unexplored area of the design space. We also present simulation results to assess the benefits of the proposed approach.
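The two-part view of request combining described above can be illustrated with a small sketch. It is not the method proposed in the paper: it assumes fetch-and-add as the combined operation (the abstract does not name one), models memory as a dictionary, and ignores the network entirely. The names `Request`, `determine_combining_set`, `combined_access`, and `distribute_results` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    pe: int    # id of the issuing processor element (hypothetical field)
    addr: int  # shared-memory address being accessed
    inc: int   # increment for the assumed fetch-and-add operation


def determine_combining_set(pending, addr):
    """Part 1: select the requests that take part in one combined access."""
    return [r for r in pending if r.addr == addr]


def combined_access(memory, addr, combining_set):
    """Perform a single memory access on behalf of the whole combining set."""
    old = memory[addr]
    memory[addr] = old + sum(r.inc for r in combining_set)
    return old


def distribute_results(old, combining_set):
    """Part 2: give each member the value it would have fetched had the
    requests been served one after another in list order."""
    results = {}
    running = old
    for r in combining_set:
        results[r.pe] = running
        running += r.inc
    return results


def demo():
    memory = {0x10: 100}
    pending = [
        Request(pe=0, addr=0x10, inc=1),
        Request(pe=1, addr=0x10, inc=5),
        Request(pe=2, addr=0x20, inc=7),  # different address, not combined
        Request(pe=3, addr=0x10, inc=2),
    ]
    cs = determine_combining_set(pending, 0x10)
    old = combined_access(memory, 0x10, cs)
    return distribute_results(old, cs), memory[0x10]
```

In this sketch, `demo()` returns `({0: 100, 1: 101, 3: 106}, 108)`: memory is touched once, yet each member of the combining set receives a distinct value consistent with some serial order. In the terms of the classification, the two functions for Part 1 and Part 2 could be assigned to different system components; here both simply run in one process.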