Modeling the performance of limited pointers directories for cache coherence
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Distributed texture memory in a multi-GPU environment
GH '06 Proceedings of the 21st ACM SIGGRAPH/EUROGRAPHICS symposium on Graphics hardware
Hi-index | 0.00 |
Directory-based cache consistency protocols have the potential to allow shared-memory multiprocessors to scale to a large number of processors. While many variations of these coherence schemes exist in the literature, they have typically been described at a rather high level, making adequate evaluation difficult. This paper explores the implementation issues of directory-based coherency strategies by developing a design at the level of detail needed to write a memory system functional simulator with an accurate timing model. The paper presents the design of both an invalidation coherency protocol and the associated directory/memory hardware. Support is added to prevent deadlock, handle subtle consistency situations, and implement a proper programming model of multiprocess execution. Extensions are delineated for realizing a multiple-threaded directory that can continue to process commands while waiting for a reply from a cache. The final hardware design is evaluated in the context of the number of parts required for implementation.