Software Distributed Shared Memory (DSM) systems based on virtual memory techniques traditionally use the hardware page as the consistency unit. The large size of the hardware page is considered to be a performance bottleneck because of the implied false sharing overheads. Instead, we show that in the presence of a relaxed consistency model and a multiple writer protocol, a large consistency unit is generally not detrimental to performance. We study the tradeoffs between false sharing and aggregation effects when using large consistency units. In this context, this paper makes three separate contributions:

1. We document the cost of false sharing in terms of extra messages and extra data being communicated. We find that, for the applications considered, when the virtual memory page is used as the consistency unit, the number of extra messages is small, while the amount of extra data can be substantial.

2. We evaluate the performance when the consistency unit is increased to a multiple of the virtual memory page size. For most applications and data sets, the performance improves, except when the false sharing effects include extra messages or a large amount of extra data.

3. We present a new algorithm for dynamically aggregating pages. In our algorithm, the aggregated pages do not necessarily need to be contiguous. In all cases, the performance of our dynamic aggregation algorithm is similar to that achieved with the best static page size.

These results were obtained by measuring the performance of eight applications on the TreadMarks distributed shared memory system. The hardware platform used is a network of 166 MHz Pentiums connected by a switched 100 Mbps Ethernet network.
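The tradeoff between aggregation and false sharing can be illustrated with a toy cost model. The sketch below is not the paper's measurement method or the TreadMarks protocol. It assumes a hypothetical function `fetch_cost`, word-granularity addresses, one message per consistency unit that must be updated, and that a reader receives every remotely written word in each unit it touches.

```python
def fetch_cost(written, read, unit_size):
    """Toy model (an assumption, not the paper's method).

    written   -- set of word addresses modified by remote writers
    read      -- set of word addresses the local reader accesses
    unit_size -- consistency unit size, in words
    """
    # Units the reader touches; each access faults on its whole unit.
    touched_units = {addr // unit_size for addr in read}
    # The reader receives every remote modification inside a touched unit.
    received = {a for a in written if a // unit_size in touched_units}
    # One message per touched unit that actually has remote modifications.
    messages = len({a // unit_size for a in received})
    useful = len(received & read)
    return {"messages": messages, "useful": useful,
            "extra": len(received) - useful}


if __name__ == "__main__":
    written = set(range(8))  # remote writers modified words 0..7
    for read in ({0, 1, 4}, {0, 1}):
        for unit in (4, 8):
            print(sorted(read), unit, fetch_cost(written, read, unit))
```

Under these assumptions, a reader that touches words 0, 1, and 4 needs two messages with a 4-word unit but only one with an 8-word unit, for the same amount of data (the aggregation effect). A reader that touches only words 0 and 1 sends one message either way, but the larger unit triples the extra data it receives (the false sharing effect).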