Two hurdles restrict the scalability of directory-based shared-memory multiprocessors: the directory memory overhead, and the long L2 miss latencies caused by the indirection of accessing directory information, which is usually stored in main memory. This work presents a lightweight directory architecture that addresses both problems. Our proposal takes advantage of on-chip integration and of the temporal locality exhibited by accesses to directory information to design a directory protocol with the best characteristics of snoopy protocols. The lightweight directory architecture removes the directory structure from main memory and stores directory information in the L2 cache, avoiding the access to main memory in most cases. The proposed architecture is evaluated through extensive execution-driven simulations of a 32-node cc-NUMA multiprocessor. Results show that the lightweight directory architecture achieves better performance than a non-scalable full-map directory, with a very significant reduction in directory memory overhead.
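To make the directory memory overhead concrete, the following minimal sketch estimates the extra memory a conventional full-map directory needs, assuming one presence bit per node for every memory block and ignoring state bits. The 64-byte block size and the function name `full_map_overhead` are illustrative assumptions, not values taken from this work.

```python
def full_map_overhead(num_nodes: int, block_bytes: int = 64) -> float:
    """Return full-map directory size as a fraction of data memory.

    Assumes one presence bit per node per memory block (state bits ignored).
    The 64-byte default block size is an illustrative assumption.
    """
    if num_nodes <= 0 or block_bytes <= 0:
        raise ValueError("num_nodes and block_bytes must be positive")
    # Directory bits per block divided by data bits per block.
    return num_nodes / (block_bytes * 8)


if __name__ == "__main__":
    # The overhead grows linearly with the node count, which is why
    # a full-map directory is considered non-scalable.
    for nodes in (32, 256, 1024):
        print(f"{nodes:5d} nodes: {full_map_overhead(nodes):.2%} overhead")
```

Under these assumptions the overhead grows linearly with the number of nodes, so the directory eventually needs more memory than the data it tracks.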