Piranha: a scalable architecture based on single-chip multiprocessing
Proceedings of the 27th annual international symposium on Computer architecture
Interconnect-power dissipation in a microprocessor
Proceedings of the 2004 international workshop on System level interconnect prediction
Improving Multiple-CMP Systems Using Token Coherence
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence
Proceedings of the 32nd annual international symposium on Computer Architecture
Improving Multiprocessor Performance with Coarse-Grain Coherence Tracking
Proceedings of the 32nd annual international symposium on Computer Architecture
A Framework for Coarse-Grain Optimizations in the On-Chip Memory Hierarchy
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Token tenure: PATCHing token counting using directory-based cache coherence
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Reactive NUCA: near-optimal block placement and replication in distributed caches
Proceedings of the 36th annual international symposium on Computer architecture
Subspace snooping: filtering snoops with operating system support
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
A Direct Coherence Protocol for Many-Core Chip Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Increasing the effectiveness of directory caches by deactivating coherence for private memory blocks
Proceedings of the 38th annual international symposium on Computer architecture
ACM SIGARCH Computer Architecture News
Hi-index | 0.00 |
Future many-core chip multiprocessors (CMPs) will integrate hundreds of processor cores on chip. Two cache coherence protocols are the mainstream applied to current CMPs. The token-based protocol (Token) provides high performance, but it generates a prohibitive amount of network traffic, which translates into excessive power consumption. The directory-based protocol (Directory) reduces network traffic, yet trades off with the storage overhead of the directory as well as entails comparatively low performance caused by indirection limiting its applicability for many-core CMPs.In this work, we present DP&TB, a novel cache coherence protocol particularly suited to future many-core CMPs. In DP&TB, cache coherence is maintained at the granularity of a page, facilitating to filter out either unnecessary coherence inspections for blocks inside private pages or network traffic for blocks inside shared pages. We employ Directory to detect private and shared pages and Token to maintain the coherence of the blocks inside shared pages. DP&TB inherits the merit of Directory and Token and overcome their problems. Experimental results show that DP&TB comprehensively beyond Directory and Token with improvement by 9.1 % in performance over Token and by 13.8 % in network traffic over Directory. In addition, the storage overhead of DP&TB is less than half of that of Directory. Our proposal can fulfill the requirement of many-core CMPs to achieve high performance, power and area efficiency.