Building expressive, area-efficient coherence directories
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Using in-flight chains to build a scalable cache coherence protocol
ACM Transactions on Architecture and Code Optimization (TACO)
The problem of cache coherence in large-scale shared-memory multiprocessors has been addressed using directory schemes. Two problems arise as the number of processors increases: the network latency grows, and the implementation cost must be kept acceptable. The authors present a tree-based cache coherence protocol called the scalable tree protocol (STP). They show that it can be implemented at a reasonable cost and that the write latency is logarithmic in the size of the sharing set. Maintaining an optimal tree structure and handling replacements efficiently are critical issues for this type of protocol, and the authors address both. They compare the performance of the STP with that of the scalable coherent interface (SCI) (IEEE standard P1596) using a classical matrix-oriented algorithm targeted at large-scale parallel processing. They show that the STP reduces the execution time considerably by reducing the write latency.
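The logarithmic write latency can be illustrated with a small back-of-the-envelope model. The sketch below is not the STP itself: it assumes, purely for illustration, that sharers form a balanced tree in which each node forwards an invalidation to all of its children in one step, and it contrasts this with a sharing set organized as a linked list that is traversed one node at a time. The function names and the `fanout` parameter are hypothetical.

```python
def tree_invalidation_rounds(num_sharers: int, fanout: int = 2) -> int:
    """Forwarding steps needed to reach every sharer in a balanced tree.

    Assumption (illustrative only): the root receives the invalidation in
    the first step, and every node forwards it to up to `fanout` children
    in parallel in the next step.
    """
    if num_sharers <= 0:
        return 0
    rounds = 0
    reached = 0
    frontier = 1  # nodes that receive the invalidation in the current step
    while reached < num_sharers:
        reached += frontier
        frontier *= fanout
        rounds += 1
    return rounds


def list_invalidation_rounds(num_sharers: int) -> int:
    """Steps needed when sharers form a list visited one node at a time."""
    return max(num_sharers, 0)


if __name__ == "__main__":
    for n in (1, 8, 64, 1024):
        print(n, tree_invalidation_rounds(n), list_invalidation_rounds(n))
```

Under these assumptions, 1024 sharers are reached in 11 forwarding steps through the tree, against 1024 steps through the list. The model ignores acknowledgements, network contention, and the cost of keeping the tree balanced, all of which a real protocol has to account for.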