An Efficient Tree Cache Coherence Protocol for Distributed Shared Memory Multiprocessors

Authors:
Yeimkuan Chang; Laxmi N. Bhuyan
Affiliations:
Chung-Hua Univ., Taiwan, Republic of China; Texas A&M Univ., College Station
Venue:
IEEE Transactions on Computers
Year:
1999

Abstract

Directory schemes have long been used to solve the cache coherence problem in large-scale shared-memory multiprocessors. Tree-based protocols have also been employed to reduce the directory size and the invalidation latency when the degree of data sharing is high. However, existing tree-based protocols incur a very high communication overhead to maintain a balanced tree, especially when the degree of data sharing is low. This paper presents a new tree-based cache coherence protocol that is a hybrid of the limited directory and linked list schemes. By using a limited number of pointers in the directory, the proposed protocol connects the nodes caching a shared block in a tree fashion without incurring any communication overhead. In addition to its low communication overhead, the proposed scheme retains the advantages of the existing bit-map and tree-based linked list protocols, namely, a scalable memory requirement and logarithmic invalidation latency. We evaluate the performance of our protocol by running four applications on the Proteus execution-driven simulator. Our simulation results show that the performance of the proposed protocol is very close to that of the full-map protocol.
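
To make the "logarithmic invalidation latency" claim concrete, the following minimal sketch counts the sequential forwarding steps needed to reach every sharer of a block. It is an illustration under stated assumptions, not the protocol from the paper: the abstract does not describe how the tree is built, so the sketch assumes an idealized complete tree in which sharer i forwards to sharers fanout*i + 1 through fanout*i + fanout, and it compares that with a plain linked list. The function names `chain_hops` and `tree_hops` and the `fanout` parameter are hypothetical.

```python
def chain_hops(num_sharers: int) -> int:
    """Linked-list scheme: the invalidation visits the sharers one by one."""
    return num_sharers


def tree_hops(num_sharers: int, fanout: int = 2) -> int:
    """Idealized tree (an assumption, not the paper's construction).

    Sharer i forwards the invalidation to sharers
    fanout*i + 1 .. fanout*i + fanout, so each hop reaches a whole level.
    Returns the number of levels that must be traversed in sequence.
    """
    if num_sharers <= 0:
        return 0
    hops = 0
    frontier = [0]  # sharer 0 is the root contacted by the directory
    while frontier:
        hops += 1
        frontier = [
            child
            for parent in frontier
            for child in range(fanout * parent + 1, fanout * parent + fanout + 1)
            if child < num_sharers
        ]
    return hops


if __name__ == "__main__":
    for n in (4, 16, 64, 1024):
        print(f"{n:5d} sharers: list={chain_hops(n):5d} hops, "
              f"binary tree={tree_hops(n):2d} hops")
```

Under these assumptions the list cost grows linearly with the number of sharers, while the tree cost grows with the logarithm of that number. Keeping a real tree this well balanced without extra messages is the problem the abstract says the proposed protocol addresses.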