Per-Node Multithreading and Remote Latency
IEEE Transactions on Computers
Responsiveness without interrupts
ICS '99 Proceedings of the 13th international conference on Supercomputing
An asynchronous protocol for release consistent distributed shared memory systems
SAC '00 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2
A Comparison of Two Strategies of Dynamic Data Prefetching in Software DSM
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
CAS-DSM: a compiler assisted software distributed shared memory
International Journal of Parallel Programming
Win32 API emulation on UNIX for software DSM
WINSYM'98 Proceedings of the 2nd conference on USENIX Windows NT Symposium - Volume 2
This paper evaluates the use of per-node multi-threading to hide remote memory and synchronization latencies in a software DSM. As in hardware systems, multi-threading in software systems can reduce the cost of remote requests by switching to another thread when the current thread blocks. We added multi-threading to the CVM software DSM and evaluated its impact on performance for a suite of common shared memory programs. Multi-threading improved speed by at least 17% in three of the seven applications in our suite, with smaller improvements in the others. However, we also found that good performance is not always achievable transparently for non-trivial applications; that multi-threading can interact negatively with DSM operations; that it decreases cache and TLB locality; and that any multi-threading speedup depends on available work.
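The latency-hiding idea in the abstract can be illustrated with a small idealized model. This is only a sketch under stated assumptions, not CVM's implementation: the function name `makespan` and its parameters are hypothetical, each work item is assumed to need one remote request followed by local computation, a node runs one thread at a time, and context-switch cost and cache/TLB effects are ignored.

```python
def makespan(num_threads, num_items, compute, latency):
    """Time for one node to finish num_items, each a remote fetch then local compute.

    Hypothetical toy model: one CPU per node, free thread switches,
    remote requests from different threads overlap fully.
    """
    # Split the items as evenly as possible across the node's threads.
    remaining = [num_items // num_threads + (1 if i < num_items % num_threads else 0)
                 for i in range(num_threads)]
    # Every thread issues its first remote request at time 0.
    ready_at = [latency] * num_threads
    clock = 0
    while any(remaining):
        # Run the thread whose reply arrives first (ties go to the lowest index).
        tid = min((i for i in range(num_threads) if remaining[i]),
                  key=lambda i: ready_at[i])
        # The CPU idles until that reply arrives, then computes on the item.
        clock = max(clock, ready_at[tid]) + compute
        remaining[tid] -= 1
        # The thread blocks again on its next remote request.
        ready_at[tid] = clock + latency
    return clock


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(k, makespan(k, num_items=8, compute=1, latency=3))
```

With 8 items, a compute time of 1, and a latency of 3, this model gives 32, 17, 11, and 11 time units for 1, 2, 4, and 8 threads. The gain flattens once there is enough other work to cover each remote request, which matches the abstract's point that any speedup depends on available work. The costs the paper reports (interaction with DSM operations, reduced cache and TLB locality) are outside this model.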