A New Scalable Directory Architecture for Large-Scale Multiprocessors

Authors:
Manuel E. Acacio;José González;José M. García;José Duato
Venue:
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Year:
2001

Citing 0
Cited 15

A Novel Approach to Reduce L2 Miss Latency in Shared-Memory Multiprocessors
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium

The Use of Prediction for Accelerating Upgrade Misses in cc-NUMA Multiprocessors
Proceedings of the 2002 International Conference on Parallel Architectures and Compilation Techniques

Owner prediction for accelerating cache-to-cache transfer misses in a cc-NUMA architecture
Proceedings of the 2002 ACM/IEEE conference on Supercomputing

An Architecture for High-Performance Scalable Shared-Memory Multiprocessors Exploiting On-Chip Integration
IEEE Transactions on Parallel and Distributed Systems

A Two-Level Directory Architecture for Highly Scalable cc-NUMA Multiprocessors
IEEE Transactions on Parallel and Distributed Systems

In-Network Cache Coherence
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture

A consistency architecture for hierarchical shared caches
Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures

Distributed cooperative caching
Proceedings of the 17th international conference on Parallel architectures and compilation techniques

A scalable organization for distributed directories
Journal of Systems Architecture: the EUROMICRO Journal

A two-level directory organization solution for CC-NUMA systems
ICA3PP'07 Proceedings of the 7th international conference on Algorithms and architectures for parallel processing

Reducing the latency of L2 misses in shared-memory multiprocessors through on-chip directory integration
EUROMICRO-PDP'02 Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing

Evaluation of low-overhead organizations for the directory in future many-core CMPs
Euro-Par 2010 Proceedings of the 2010 conference on Parallel processing

A new hybrid directory scheme for shared memory multi-processors
CSR'06 Proceedings of the First international computer science conference on Theory and Applications

Complexity-effective multicore coherence
Proceedings of the 21st international conference on Parallel architectures and compilation techniques

Building expressive, area-efficient coherence directories
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques

Abstract

The memory overhead introduced by directories constitutes a major hurdle in the scalability of cc-NUMA architectures, which makes the shared-memory paradigm unfeasible for very large-scale systems. This work focuses on improving the scalability of shared-memory multiprocessors by significantly reducing the size of the directory. We propose multilayer clustering as an effective approach to reduce the directory-entry width. Detailed evaluation for 64 processors shows that using this approach we can drastically reduce the memory overhead, while suffering a performance degradation very similar to that of previous compressed schemes (such as Coarse Vector). In addition, a novel two-level directory architecture is proposed in order to eliminate the penalty caused by these compressed directories. This organization consists of a small Full-Map first-level directory (which provides precise information for the most recently referenced lines) and a compressed second-level directory (which provides in-excess information). Results show that a system with this directory architecture can achieve the same performance as a multiprocessor with a big and non-scalable Full-Map directory, with a very significant reduction of the memory overhead.
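
The two-level organization described in the abstract can be illustrated with a minimal sketch. This is not the authors' design: the abstract does not give the encoding used by multilayer clustering, so the second level below is a plain Coarse Vector (one bit per group of nodes), and the LRU replacement, the allocation rule, the class name `TwoLevelDirectory`, and the parameters `region` and `l1_entries` are all assumptions made for illustration. Only the 64-node system size comes from the abstract.

```python
from collections import OrderedDict


def full_map_bits(num_nodes):
    """Full-Map sharing code: one presence bit per node."""
    return num_nodes


def coarse_vector_bits(num_nodes, region):
    """Coarse Vector sharing code: one bit per group of `region` nodes."""
    return -(-num_nodes // region)  # ceiling division


class TwoLevelDirectory:
    """Hypothetical model: small precise first level, compressed second level."""

    def __init__(self, num_nodes=64, region=4, l1_entries=2):
        self.num_nodes = num_nodes
        self.region = region          # assumed coarse-vector group size
        self.l1_entries = l1_entries  # assumed first-level capacity
        self.first_level = OrderedDict()  # line -> exact sharer set, LRU order
        self.second_level = {}            # line -> set of region indices

    def add_sharer(self, line, node):
        newly_tracked = line not in self.second_level
        # The second level is always updated, so it never loses a sharer.
        self.second_level.setdefault(line, set()).add(node // self.region)
        if line in self.first_level:
            self.first_level[line].add(node)
            self.first_level.move_to_end(line)
        elif newly_tracked:
            # Assumption: a precise entry is allocated only for a line with
            # no earlier (imprecise) history, so its contents stay exact.
            self.first_level[line] = {node}
            if len(self.first_level) > self.l1_entries:
                self.first_level.popitem(last=False)  # evict LRU entry

    def sharers(self, line):
        """Return (node list, is_precise) for a memory line."""
        if line in self.first_level:
            self.first_level.move_to_end(line)
            return sorted(self.first_level[line]), True
        # Fall back to the compressed code: every node of each marked region
        # is reported, which is a superset of the true sharers.
        nodes = []
        for r in sorted(self.second_level.get(line, set())):
            start = r * self.region
            nodes.extend(range(start, min(start + self.region, self.num_nodes)))
        return nodes, False
```

With these assumed parameters, a line that still sits in the first level reports its exact sharers, while a line evicted from it reports every node in the marked regions, so a write would send some unnecessary invalidations. For 64 nodes, the Full-Map code takes 64 bits per entry and the assumed Coarse Vector with 4-node regions takes 16.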