Software DSMs have been a research topic for over a decade. While good performance has been achieved in some cases, consistent performance has continued to elude researchers. This paper investigates the performance of DSM protocols running highly regular scientific applications. Such applications should be ideal targets for DSM research because past behavior gives complete, or nearly complete, information about future behavior. We show that a modified home-based protocol can significantly outperform more general protocols in this application domain because of reduced protocol complexity. Nonetheless, such protocols still do not perform as well as expected. We show that one of the major factors limiting performance is interaction with the operating system on page faults and page protection changes. We further optimize our protocol by completely eliminating such memory manipulation calls from steady-state execution. The resulting protocol improves average application performance by a further 34%, on top of the 19% improvement gained by our initial modification of the home-based protocol.