While update protocols generally induce lower miss rates than invalidate protocols, they tend to generate a great deal of traffic. This is one of the reasons why they are considered less cost-effectively scalable than invalidate protocols and, as a result, are avoided in most existing designs of scalable shared-memory multiprocessors. However, given the increasing relative cost of cache misses, update protocols are becoming more worthy of exploration. In this paper, we present a model of sharing that is key to investigating the performance of optimized update protocols: the Update Distance Model. The model gives insight into the update patterns that optimized protocols need to handle. Using this model, we design a new family of protocols that we call Distance-Adaptive Protocols. In these schemes, the directory records the update patterns observed and then uses them to selectively send updates and invalidations to processors. As a result, traffic and miss rates are kept low. We present an implementation of these protocols based on a dynamic pointer scheme. A performance comparison between one of these protocols and efficient invalidate and delayed competitive-update protocols over five applications shows that the new protocol decreases the execution time by an average of 15% and 10%, respectively.
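The idea of a directory that records update patterns and then chooses between updates and invalidations can be illustrated with a small sketch. This is not the protocol from the paper: the abstract does not define the update distance, so the code assumes it is the number of writes by other processors between two consecutive accesses by a given processor. The class `DirectoryEntry`, its methods, and the `max_distance` cutoff are all hypothetical names chosen for illustration.

```python
class DirectoryEntry:
    """Toy directory state for one memory block (illustrative only)."""

    def __init__(self, max_distance=2):
        # Hypothetical cutoff: beyond this many unconsumed writes,
        # sending updates is assumed to be wasted traffic.
        self.max_distance = max_distance
        self.sharers = set()            # processors holding a valid copy
        self.writes_since_access = {}   # proc -> writes by others since its last access
        self.last_distance = {}         # proc -> distance observed at its last return

    def access(self, proc):
        """proc reads or writes the block; returns True if it was a miss."""
        if proc in self.writes_since_access:
            # Record the pattern: how many remote writes happened while proc was away.
            self.last_distance[proc] = self.writes_since_access[proc]
        self.writes_since_access[proc] = 0
        miss = proc not in self.sharers
        self.sharers.add(proc)
        return miss

    def write(self, writer):
        """writer writes the block; returns (updated, invalidated) processor lists."""
        self.access(writer)
        updated, invalidated = [], []
        for proc in sorted(self.writes_since_access):
            if proc == writer:
                continue
            self.writes_since_access[proc] += 1
            if proc not in self.sharers:
                continue  # already invalidated; keep counting its distance
            predicted = self.last_distance.get(proc, 0)
            too_far = self.writes_since_access[proc] > self.max_distance
            if predicted > self.max_distance or too_far:
                self.sharers.discard(proc)
                invalidated.append(proc)   # history suggests the update would go unused
            else:
                updated.append(proc)       # history suggests proc will read soon
        return updated, invalidated
```

In this sketch a sharer that has recently returned to the block after only a few remote writes keeps receiving updates, while a sharer whose recorded distance exceeded the cutoff is invalidated on the next write. That is one simple way to keep both traffic and miss rates low under the stated assumption; the paper's actual protocols and their dynamic pointer implementation may differ substantially.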