This paper evaluates the use of per-node multithreading to hide remote memory and synchronization latencies in software DSMs. As with hardware systems, multithreading in software systems can be used to reduce the costs of remote requests by running other threads when the current thread blocks. We added multithreading to the CVM software DSM and evaluated its impact on the performance of a suite of common shared memory programs. Multithreading resulted in speed improvements of at least 20 percent in two of the applications, and better than 15 percent for several other applications. However, we also found that good performance cannot always be achieved transparently for nontrivial applications. Also, the characteristics of the underlying DSM protocol can have a large effect on multithreading's utility.
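The latency-hiding idea can be illustrated with a toy model. The sketch below is not CVM and is not taken from the paper: `simulate_node` and all of its parameters are hypothetical names, and the model assumes one CPU per node, a fixed remote latency, zero context-switch cost, and threads that yield only when they block on a remote request. It simply counts how long a node needs to finish a fixed number of work items when those items are split across more or fewer threads.

```python
import heapq
from collections import deque


def simulate_node(num_threads, work_items, compute_time, remote_latency):
    """Return the time at which one node finishes all work items.

    Toy model (an assumption, not the paper's system): each item needs one
    remote fetch, during which the thread blocks for `remote_latency`,
    followed by `compute_time` units on the node's single CPU.
    """
    if num_threads < 1 or work_items < num_threads:
        raise ValueError("need at least one item per thread")
    if work_items % num_threads != 0:
        raise ValueError("work_items must divide evenly among threads")

    remaining = [work_items // num_threads] * num_threads
    # Every thread issues its first remote fetch at time 0.
    blocked = [(remote_latency, tid) for tid in range(num_threads)]
    heapq.heapify(blocked)
    ready = deque()
    clock = 0

    while blocked or ready:
        if not ready:
            # CPU is idle: skip ahead to the earliest outstanding reply.
            clock = max(clock, blocked[0][0])
        # Wake every thread whose reply has arrived by now.
        while blocked and blocked[0][0] <= clock:
            _, tid = heapq.heappop(blocked)
            ready.append(tid)
        # Run one ready thread until it blocks on its next fetch.
        tid = ready.popleft()
        clock += compute_time
        remaining[tid] -= 1
        if remaining[tid] > 0:
            heapq.heappush(blocked, (clock + remote_latency, tid))
    return clock


if __name__ == "__main__":
    for threads in (1, 2, 3, 4):
        finish = simulate_node(threads, work_items=12,
                               compute_time=5, remote_latency=10)
        print(threads, finish)
```

With these illustrative numbers, a single thread pays the full latency on every item, while three threads already keep the CPU busy and a fourth adds nothing. The model deliberately omits context-switch overhead, synchronization, and protocol behavior, which the abstract identifies as factors that can limit the benefit in practice.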