Two proposals for the inclusion of directory information in the last-level private caches of glueless shared-memory multiprocessors

Authors:
Alberto Ros; Ricardo Fernández-Pascual; Manuel E. Acacio; José M. García
Affiliations:
Departamento de Ingeniería y Tecnología de Computadores, Universidad de Murcia, 30080 Murcia, Spain (all authors)
Venue:
Journal of Parallel and Distributed Computing
Year:
2008

Abstract

In glueless shared-memory multiprocessors, where cache coherence is usually maintained using a directory-based protocol, fast access to the on-chip components (caches and network router, among others) contrasts with the much slower access to main memory. Unfortunately, directory-based protocols need to obtain the sharing status of every memory block before coherence actions can be performed. This information has traditionally been stored in main memory, so these cache coherence protocols are far from optimal. In this work, we propose two alternative designs for the last-level private cache of glueless shared-memory multiprocessors: the lightweight directory and the SGluM cache. Both proposals completely remove directory information from main memory and store it in the home node's L2 cache, thus reducing both the number of accesses to main memory and the directory memory overhead. The main characteristics of the lightweight directory are its simplicity and the significant improvement in execution time that it achieves for most applications. Its drawback, however, is that it could degrade the performance of some particular applications. The SGluM cache, on the other hand, offers more modest improvements in execution time, but for all the applications, by adding some extra structures that cope with the cases in which the lightweight directory fails.
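
The central idea of the abstract, keeping the sharing status of each block in the home node's L2 cache rather than in main memory, can be illustrated with a toy model. The sketch below is not the authors' lightweight directory or SGluM design: the class names, the sharer-set format, and the rule that a memory-resident directory costs exactly one main-memory access per lookup are all assumptions made for illustration, and L2 capacity and replacement are ignored entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class DirectoryEntry:
    """Sharing status of one memory block (hypothetical format)."""
    sharers: Set[int] = field(default_factory=set)  # nodes holding a copy
    owner: Optional[int] = None                     # node with exclusive access


class HomeNode:
    """Toy home node that counts main-memory accesses caused by directory lookups."""

    def __init__(self, directory_in_l2: bool) -> None:
        self.directory_in_l2 = directory_in_l2
        self.directory: Dict[int, DirectoryEntry] = {}
        self.memory_directory_accesses = 0

    def lookup(self, block: int) -> DirectoryEntry:
        # Assumption: a memory-resident directory costs one main-memory access
        # per lookup, while an L2-resident directory is reached on chip.
        if not self.directory_in_l2:
            self.memory_directory_accesses += 1
        return self.directory.setdefault(block, DirectoryEntry())

    def read_miss(self, block: int, requester: int) -> None:
        entry = self.lookup(block)
        entry.owner = None            # any exclusive owner becomes a plain sharer
        entry.sharers.add(requester)

    def write_miss(self, block: int, requester: int) -> List[int]:
        entry = self.lookup(block)
        invalidated = sorted(entry.sharers - {requester})
        entry.sharers = {requester}
        entry.owner = requester
        return invalidated            # nodes that must receive an invalidation


def run(directory_in_l2: bool):
    """Two read misses followed by a write miss on the same block."""
    home = HomeNode(directory_in_l2)
    home.read_miss(0x40, requester=1)
    home.read_miss(0x40, requester=2)
    invalidated = home.write_miss(0x40, requester=3)
    return invalidated, home.memory_directory_accesses
```

Under these assumptions, `run(False)` and `run(True)` produce the same invalidation list, but only the memory-resident variant charges a main-memory access for each of the three lookups. A real design also has to handle directory entries competing with data for L2 space, which this sketch deliberately leaves out.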