In this paper, we categorize the coherence traffic in update-based protocols and show that, for most applications, more than 90% of all updates generated by the protocol are unnecessary. We identify application characteristics that generate useless update traffic, and compare the isolated and combined effects of several software and hardware techniques for eliminating useless updates. These techniques include dynamic and static hybrid protocols, a data re-mapping strategy, and coalescing write buffers. Our simulations show that each technique is effective for different types of useless updates. Overall, software caching (where dynamic data re-mapping is performed under programmer or compiler control) has the potential to significantly increase the percentage of useful traffic in applications. When software caching is not applicable, either the static or the dynamic hybrid protocol generates the least useless traffic. Although coalescing write buffers greatly reduce the total number of messages transferred, these buffers do not necessarily increase the percentage of useful traffic.
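To make the useful/useless distinction concrete, the following is a minimal sketch of how updates in a memory trace might be classified. It is not the paper's simulator or its categorization. It assumes a simplified definition (an update is useful only if the receiving processor reads the word before that word is written again, and useless otherwise), assumes every processor caches every word, and uses hypothetical names (`Access`, `classify_updates`).

```python
from collections import namedtuple

# Hypothetical trace record: processor id, operation ("r" or "w"), word address.
Access = namedtuple("Access", "proc op addr")


def classify_updates(trace, num_procs):
    """Count useful and useless updates under a simplified, assumed definition.

    Assumptions (not taken from the paper): every processor caches every
    word, each write sends one update to every other processor, and an
    update is useful only if the receiver reads the word before the word
    is written again.
    """
    pending = set()  # (receiver, addr) pairs holding an update not yet read
    useful = 0
    useless = 0
    for acc in trace:
        key = (acc.proc, acc.addr)
        if acc.op == "r":
            if key in pending:
                # The receiver consumed the update it was sent.
                pending.discard(key)
                useful += 1
        elif acc.op == "w":
            if key in pending:
                # The writer overwrote an update it received but never read.
                pending.discard(key)
                useless += 1
            for q in range(num_procs):
                if q == acc.proc:
                    continue
                if (q, acc.addr) in pending:
                    # The previous update was overwritten before being read.
                    useless += 1
                pending.add((q, acc.addr))
    # Updates still unread when the trace ends were never needed.
    useless += len(pending)
    return useful, useless


# Processor 0 writes the same word three times before processor 1 reads it.
trace = [
    Access(0, "w", 0x10),
    Access(0, "w", 0x10),
    Access(0, "w", 0x10),
    Access(1, "r", 0x10),
]
print(classify_updates(trace, num_procs=2))  # (1, 2)
```

In this toy trace, only the last of the three updates sent to processor 1 is read, so two of the three are counted as useless. Adding an idle third processor raises the useless count further, since it receives every update and reads none.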