Boosting the performance of hybrid snooping cache protocols

Authors:
Fredrik Dahlgren
Affiliations:
Department of Computer Engineering, Lund University, P.O. Box 118, S-221 00 LUND, Sweden
Venue:
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Year:
1995

Abstract

Previous studies of bus-based shared-memory multiprocessors have shown hybrid write-invalidate/write-update snooping protocols to be incapable of providing consistent performance improvements over write-invalidate protocols. In this paper, we analyze the deficiencies of hybrid snooping protocols under release consistency, and show how these deficiencies can be dramatically reduced by using write caches and read snarfing. Our performance evaluation is based on program-driven simulation and a set of five scientific applications with different sharing behaviors, including migratory sharing as well as producer-consumer sharing. We show that a hybrid protocol, extended with write caches as well as read snarfing, manages to reduce the number of coherence misses by between 83% and 95% as compared to a write-invalidate protocol for all five applications in this study. In addition, the number of bus transactions is reduced by between 36% and 60% for four of the applications and by 9% for the fifth application. Because of the small implementation cost of the hybrid protocol and the two extensions, we believe that this combination is an effective approach to boost the performance of bus-based multiprocessors.