The Scalable Coherent Interface (SCI), a local or extended computer backplane interface being defined by an IEEE standard project (P1596), is discussed. The interconnection is scalable, meaning that up to 64K processor, memory, or I/O nodes can effectively interface to a shared SCI interconnection. The SCI sharing-list structures are described, and sharing-list addition and removal are examined. Optimizations being considered to improve the performance of large system configurations are discussed. Request combining, a useful feature of linked-list coherence, is described. SCI's optional extensions, including synchronization using a queued-on-lock bit, are considered.
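The sharing-list addition and removal mentioned above can be pictured with a small model. The following is a minimal sketch, not the protocol as specified: it assumes a doubly linked list of sharers per memory line, with new sharers prepended at the head, and it runs sequentially in one process, whereas real SCI nodes exchange messages and must handle concurrent requests. The names `CacheEntry`, `SharingList`, `forward`, and `backward` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CacheEntry:
    """One node's cached copy of a line, with links to its list neighbours."""
    forward: Optional[int] = None   # next sharer toward the tail (hypothetical name)
    backward: Optional[int] = None  # previous sharer toward the head (hypothetical name)


@dataclass
class SharingList:
    """Sharing list for a single memory line; node IDs are plain integers."""
    head: Optional[int] = None  # pointer kept alongside the memory line
    entries: Dict[int, CacheEntry] = field(default_factory=dict)

    def add(self, node: int) -> None:
        """Prepend a new sharer: it becomes the head and the old head follows it."""
        if node in self.entries:
            return  # already a sharer, nothing to do
        old_head = self.head
        self.entries[node] = CacheEntry(forward=old_head, backward=None)
        if old_head is not None:
            self.entries[old_head].backward = node
        self.head = node

    def remove(self, node: int) -> None:
        """Unlink a sharer by patching its neighbours (or the head pointer)."""
        entry = self.entries.pop(node)
        if entry.backward is None:
            self.head = entry.forward  # the removed node was the head
        else:
            self.entries[entry.backward].forward = entry.forward
        if entry.forward is not None:
            self.entries[entry.forward].backward = entry.backward

    def sharers(self) -> List[int]:
        """Walk the list from head to tail and return the node IDs in order."""
        out: List[int] = []
        cur = self.head
        while cur is not None:
            out.append(cur)
            cur = self.entries[cur].forward
        return out


if __name__ == "__main__":
    line = SharingList()
    for node_id in (3, 7, 9):
        line.add(node_id)
    print(line.sharers())
    line.remove(7)
    print(line.sharers())
```

In this model the memory side stores only the head pointer, so the per-line directory state stays constant regardless of how many nodes share the line; the cost is that walking the list takes time proportional to the number of sharers.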