Kendall Square Research introduced the KSR1 system in 1991. The architecture is based on a ring of rings of 64-bit microprocessors. It is a scalable, distributed shared-memory system. The memory structure is unique and is the key to understanding the system. Different levels of caching eliminate physical memory addressing and lead to the ALLCACHE™ scheme. Since requested data may be found in any of several caches, the initial access time is variable. Once data is pulled into the local (sub)cache, subsequent access times are fixed and minimal. Thus, the KSR1 is a Cache-Only Memory Architecture (COMA) system.

This paper describes experimentation and an analytic model of the KSR1. The focus is on the poststore programmer option. With the poststore option, the programmer can elect to broadcast the updated value of a variable to all processors that might have a copy. This may save time for threads on other processors, but it delays the broadcasting thread and places additional traffic on the ring. The specific issue addressed is to determine under what conditions poststore is beneficial. The analytic model and the experimental observations are in good agreement. They indicate that the decision to use poststore depends on both the application and the current system load.
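The trade-off described above can be illustrated with a toy break-even calculation. This is only a sketch under stated assumptions, not the paper's analytic model: the function names, the cycle counts, and the way ring load inflates the broadcast cost (a simple 1/(1 - utilization) factor) are all hypothetical and chosen purely for illustration.

```python
def poststore_cost(base_cost, ring_utilization):
    """Cycles the writing thread spends on a poststore broadcast.

    ASSUMPTION: the cost grows as base_cost / (1 - utilization), a simple
    queueing-style inflation. The abstract does not give this form.
    """
    if not 0.0 <= ring_utilization < 1.0:
        raise ValueError("ring_utilization must be in [0, 1)")
    return base_cost / (1.0 - ring_utilization)


def poststore_net_benefit(base_cost, ring_utilization, readers_with_copy,
                          read_probability, remote_miss_cost, local_hit_cost):
    """Expected cycles saved by readers minus cycles lost by the writer.

    A positive result suggests poststore pays off under this toy model;
    a negative result suggests it does not.
    """
    # Each reader that later reads the variable avoids one remote miss.
    saved_per_reader = remote_miss_cost - local_hit_cost
    expected_saving = readers_with_copy * read_probability * saved_per_reader
    return expected_saving - poststore_cost(base_cost, ring_utilization)


if __name__ == "__main__":
    # Hypothetical numbers, not measurements from the KSR1.
    params = dict(base_cost=50, readers_with_copy=4, read_probability=0.5,
                  remote_miss_cost=150, local_hit_cost=20)
    for load in (0.0, 0.5, 0.9):
        net = poststore_net_benefit(ring_utilization=load, **params)
        verdict = "use poststore" if net > 0 else "skip poststore"
        print(f"ring utilization {load:.1f}: net benefit {net:.1f} cycles, {verdict}")
```

Even in this simplified form, the same application parameters can favor poststore on a lightly loaded ring and disfavor it on a heavily loaded one, which is consistent with the stated conclusion that the decision depends on both the application and the current system load.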