Recent technological advances have produced network interfaces that provide users with very low-latency access to the memory of remote machines. We examine the impact of such networks on the implementation and performance of software DSM. Specifically, we compare two DSM systems, Cashmere and TreadMarks, on a 32-processor DEC Alpha cluster connected by a Memory Channel network.

Both Cashmere and TreadMarks use virtual memory to maintain coherence on pages, and both use lazy, multi-writer release consistency. The systems differ dramatically, however, in the mechanisms used to track sharing information and to collect and merge concurrent updates to a page, with the result that Cashmere communicates much more frequently, and at a much finer grain.

Our principal conclusion is that low-latency networks make DSM based on fine-grain communication competitive with more coarse-grain approaches, but that further hardware improvements will be needed before such systems can provide consistently superior performance. In our experiments, Cashmere scales slightly better than TreadMarks for applications with false sharing. At the same time, it is severely constrained by limitations of the current Memory Channel hardware. In general, performance is better for TreadMarks.
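
To make the idea of merging concurrent updates to a page concrete, the following is a minimal sketch of one well-known way a multi-writer protocol can do it: each writer keeps an unmodified copy (a "twin") of the page, and only the bytes that differ from the twin (a "diff") are later applied to a shared copy. The abstract does not describe either system's mechanism at this level, so the page size, the function names (make_twin, compute_diff, apply_diff, demo) and the byte-level granularity are all illustrative assumptions, not the implementation of Cashmere or TreadMarks.

```python
from typing import Dict

PAGE_SIZE = 8  # assumed toy page size; real pages are kilobytes


def make_twin(page: bytearray) -> bytes:
    """Save an immutable copy of the page before the first local write."""
    return bytes(page)


def compute_diff(twin: bytes, page: bytearray) -> Dict[int, int]:
    """Record only the offsets whose contents changed since the twin was made."""
    return {i: new for i, (old, new) in enumerate(zip(twin, page)) if old != new}


def apply_diff(page: bytearray, diff: Dict[int, int]) -> None:
    """Merge one writer's modifications into another copy of the page."""
    for offset, value in diff.items():
        page[offset] = value


def demo() -> bytes:
    """Two writers modify disjoint bytes of the same page (false sharing)."""
    master = bytearray(PAGE_SIZE)  # hypothetical shared copy, all zeros

    # Each writer works on its own local copy and keeps a twin.
    copy_a, copy_b = bytearray(master), bytearray(master)
    twin_a, twin_b = make_twin(copy_a), make_twin(copy_b)

    copy_a[0] = 1  # writer A touches the first byte
    copy_b[7] = 2  # writer B touches the last byte

    # At a release point, each writer's diff is merged into the shared copy.
    apply_diff(master, compute_diff(twin_a, copy_a))
    apply_diff(master, compute_diff(twin_b, copy_b))
    return bytes(master)


if __name__ == "__main__":
    print(list(demo()))
```

Because the two writers touch different offsets, their diffs do not conflict and both updates survive the merge. This is the property that lets a multi-writer protocol tolerate false sharing without moving the whole page back and forth between writers.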