An evaluation of directory schemes for cache coherence
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
LimitLESS directories: A scalable cache coherence scheme
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Coherence controller architectures for SMP-based CC-NUMA multiprocessors
Proceedings of the 24th annual international symposium on Computer architecture
An empirical evaluation of two memory-efficient directory methods
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence
Proceedings of the 32nd annual international symposium on Computer Architecture
Improving Multiprocessor Performance with Coarse-Grain Coherence Tracking
Proceedings of the 32nd annual international symposium on Computer Architecture
WAYPOINT: scaling coherence to thousand-core architectures
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Cache coherency protocols implemented in today's shared-memory multiprocessor systems use a snooping mechanism to keep data correct and consistent between the caches and system memory. This requires a large number of snoops to be sent over the system interconnect links. However, published research has shown that a large percentage of these snoops are unnecessary or can be eliminated. Several techniques have been proposed to detect and eliminate these unnecessary snoops, but they have not been evaluated using commercial server benchmarks and the large caches that are common on today's server platforms. In this paper, we evaluate three popular snoop filtering techniques, namely Region Scout (RS), Region Coherence Array (RCA) and Directory Cache (DC), using four different commercial server workloads. We compare and contrast the three techniques and show how effective each is at eliminating unnecessary snoops. The techniques differ in their implementation approaches, and these differences yield accuracy and area trade-offs. We show that 38% to 98% of last-level cache snoops are unnecessary in major commercial server benchmarks. With the snoop filtering techniques, we are able to eliminate 35% to 97% of the unnecessary snoops at a cost of 1-3% additional die area.
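To make the two measured quantities concrete (snoops that are unnecessary, and the share of those a filter removes), the following is a minimal toy sketch in Python. It is not the paper's simulator and does not reproduce RS, RCA, or DC. It assumes a generic coarse-grain filter that skips a broadcast when no other node caches any block in the requested region. The class name `ToySnoopFilterModel`, the 64-byte block size, the 4 KiB region size, and the infinite, read-only caches are all illustrative assumptions.

```python
# Toy model, for illustration only: infinite caches, reads only, no evictions.
BLOCK_BITS = 6    # assumed 64-byte cache blocks
REGION_BITS = 12  # assumed 4 KiB coherence regions


class ToySnoopFilterModel:
    """Counts unnecessary snoops and those a region-level filter would skip."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        # Per-node set of cached block numbers.
        self.blocks = [set() for _ in range(num_nodes)]
        # Per-node count of cached blocks in each region.
        self.region_counts = [dict() for _ in range(num_nodes)]
        self.requests = 0     # misses that would broadcast a snoop
        self.unnecessary = 0  # snoops for blocks no other node caches
        self.filtered = 0     # unnecessary snoops the region filter avoids

    def access(self, node, addr):
        block = addr >> BLOCK_BITS
        region = addr >> REGION_BITS
        if block in self.blocks[node]:
            return  # local hit, no snoop needed
        self.requests += 1
        others = [n for n in range(self.num_nodes) if n != node]
        if not any(block in self.blocks[n] for n in others):
            self.unnecessary += 1
            # The coarse filter can only skip the snoop when the whole
            # region is absent from every other node.
            if not any(self.region_counts[n].get(region, 0) for n in others):
                self.filtered += 1
        self.blocks[node].add(block)
        counts = self.region_counts[node]
        counts[region] = counts.get(region, 0) + 1


if __name__ == "__main__":
    model = ToySnoopFilterModel(num_nodes=2)
    for node, addr in [(0, 0x0000), (1, 0x0040), (1, 0x0000), (0, 0x2000)]:
        model.access(node, addr)
    print(model.requests, model.unnecessary, model.filtered)
```

In this sketch the filter is conservative: it never skips a snoop that is needed, but it misses unnecessary snoops whenever another node caches a different block in the same region. That loss is one simplified view of the accuracy side of the accuracy and area trade-off mentioned in the abstract.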