A performance bottleneck arises in distributed shared-memory multiprocessors when many requests for the same data arrive simultaneously. One architectural solution is to distribute read requests to nodes other than the home node: these other nodes act as intermediaries (i.e., proxies) that obtain the data on behalf of the requesters and combine requests for the same data. Adaptive proxies apply proxying during a proxying period that varies with the level of run-time congestion. Simulation results show that adaptive proxies improve performance for all of our benchmark applications.
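
The request-combining idea can be illustrated with a minimal sketch. It is not the protocol evaluated here: the function names, the modulo mapping from reader to proxy, and the node numbers are all assumptions made for illustration. It only counts how many read requests reach the home node when 16 readers ask for the same data, first directly and then through 4 proxies that each forward a single combined request.

```python
from collections import defaultdict


def home_requests_direct(readers):
    """Every reader sends its read request straight to the home node."""
    return len(readers)


def assign_proxy(reader, proxies):
    """Hypothetical mapping: choose a proxy by reader id modulo the proxy count."""
    return proxies[reader % len(proxies)]


def home_requests_with_proxies(readers, proxies):
    """Readers send to proxies; each proxy combines its requests into one."""
    pending = defaultdict(list)  # proxy id -> readers waiting on that proxy
    for reader in readers:
        pending[assign_proxy(reader, proxies)].append(reader)
    # Each proxy with at least one waiting reader forwards one request to home.
    return len(pending), dict(pending)


readers = list(range(16))        # assumed node ids of the requesting processors
proxies = [100, 101, 102, 103]   # assumed node ids of the proxy nodes

direct = home_requests_direct(readers)
combined, waiting = home_requests_with_proxies(readers, proxies)
print(direct, combined)  # 16 4
```

In this toy setting the home node handles 4 requests instead of 16, and each proxy answers its waiting readers once the data arrives. The sketch deliberately leaves out the adaptive part, since how the proxying period is chosen from run-time congestion is not specified in the text above.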