Firefly: A Multiprocessor Workstation
IEEE Transactions on Computers - Special issue on architectural support for programming languages and operating systems
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Evaluating the performance of four snooping cache coherency protocols
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Performance evaluation of memory consistency models for shared-memory multiprocessors
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
The Stanford Dash Multiprocessor
Computer
SPLASH: Stanford parallel applications for shared-memory
ACM SIGARCH Computer Architecture News
An adaptive cache coherence protocol optimized for migratory sharing
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Cache write policies and performance
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Combined performance gains of simple cache protocol extensions
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
An Adaptive Update-Based Cache Coherence Protocol for Reduction of Miss Rate and Traffic
PARLE '94 Proceedings of the 6th International PARLE Conference on Parallel Architectures and Languages Europe
Writes caches as an alternative to write buffers
Writes caches as an alternative to write buffers
A New Solution to Coherence Problems in Multicache Systems
IEEE Transactions on Computers
SFCS '86 Proceedings of the 27th Annual Symposium on Foundations of Computer Science
An adaptive cache coherence protocol for chip multiprocessors
Proceedings of the Second International Forum on Next-Generation Multicore/Manycore Technologies
Bandwidth Adaptive Cache Coherence Optimizations for Chip Multiprocessors
International Journal of Parallel Programming
Hi-index | 0.00 |
Coherence misses limit the performance of write-invalidate cache protocols in large-scale shared-memory multi-processors. By contrast, hybrid protocols mix updates with invalidations and can reduce the coherence miss rate. The gains of the fewer coherence misses, however, can sometimes be outweighed by contention due to the extra traffic making techniques to cut the write traffic important. We study in this paper how write traffic for hybrid protocols can be reduced by incorporating a write cache in each node. Detailed architectural simulations reveal that write caches are effective in exploiting locality in write accesses under relaxed memory consistency models. Hybrid protocols augmented with write caches with only a few entries are shown to outperform a write-invalidate protocol for all five benchmark applications under study.