Hierarchical cache/bus architecture for shared memory multiprocessors
ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture
On the inclusion properties for multi-level cache hierarchies
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Multi-level shared caching techniques for scalability in VMP-M/C
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Characteristics of performance-optimal multi-level cache hierarchies
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Introducing memory into the switch elements of multiprocessor interconnection networks
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Fast Algorithms for Routing Around Faults in Multibutterflies and Randomly-Wired Splitter Networks
IEEE Transactions on Computers - Special issue on fault-tolerant computing
A distributed shared memory multiprocessor ASURA: memory and cache architecture
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
The GLOW cache coherence protocol extensions for widely shared data
ICS '96 Proceedings of the 10th international conference on Supercomputing
An evaluation of directory schemes for cache coherence
25 years of the international symposia on Computer architecture (selected papers)
The DASH prototype: implementation and performance
25 years of the international symposia on Computer architecture (selected papers)
Computer organization and design (2nd ed.): the hardware/software interface
Computer organization and design (2nd ed.): the hardware/software interface
An Efficient Tree Cache Coherence Protocol for Distributed Shared Memory Multiprocessors
IEEE Transactions on Computers
CACHET: an adaptive cache coherence protocol for distributed shared-memory systems
ICS '99 Proceedings of the 13th international conference on Supercomputing
Propeties of storage hierarchy systems with multiple page sizes and redundant data
ACM Transactions on Database Systems (TODS)
Route packets, not wires: on-chip inteconnection networks
Proceedings of the 38th annual Design Automation Conference
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Design of an Adaptive Cache Coherence Protocol for Large Scale Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
The Networks of the Connection Machine CM-5
Proceedings of the First Heinz Nixdorf Symposium on Parallel Architectures and Their Efficient Use
Using cache memory to reduce processor-memory traffic
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
Bluespec: A language for hardware design, simulation, synthesis and verification Invited Talk
MEMOCODE '03 Proceedings of the First ACM and IEEE International Conference on Formal Methods and Models for Co-Design
Token coherence: decoupling performance and correctness
Proceedings of the 30th annual international symposium on Computer architecture
A New Scalable Directory Architecture for Large-Scale Multiprocessors
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
IEEE Transactions on Parallel and Distributed Systems
A Two-Level Directory Architecture for Highly Scalable cc-NUMA Multiprocessors
IEEE Transactions on Parallel and Distributed Systems
Adaptive Mechanisms and Policies for Managing Cache Hierarchies in Chip Multiprocessors
Proceedings of the 32nd annual international symposium on Computer Architecture
Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads and Scaling
Proceedings of the 32nd annual international symposium on Computer Architecture
An efficient cache design for scalable glueless shared-memory multiprocessors
Proceedings of the 3rd conference on Computing frontiers
Cooperative Caching for Chip Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Flexible Snooping: Adaptive Forwarding and Filtering of Snoops in Embedded-Ring Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Interconnect-Aware Coherence Protocols for Chip Multiprocessors
Proceedings of the 33rd annual international symposium on Computer Architecture
Coherence Ordering for Ring-based Chip Multiprocessors
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Virtual hierarchies to support server consolidation
Proceedings of the 34th annual international symposium on Computer architecture
The Power of Priority: NoC Based Distributed Cache Coherency
NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs
IEEE Transactions on Computers
Low depth cache-oblivious algorithms
Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
Manager-client pairing: a framework for implementing coherence hierarchies
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Using in-flight chains to build a scalable cache coherence protocol
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Hierarchical Cache Consistency (HCC) is a scalable cache-consistency architecture for chip multiprocessors in which caches are shared hierarchically. HCC's cache-consistency protocol is embedded in the message-routing network that interconnects the caches, providing a distributed and scalable alternative to bus-based and directory-based consistency mechanisms. The HCC consistency protocol is "progressive" in that every message makes monotonic progress without timeouts, retries, negative acknowledgments, or retreating in any way. The latency is at most proportional to the diameter of the network. For HCC with a binary fat-tree network, the protocol requires at most 13 bits of additional state per cache line, no matter how large the system. We prove that the HCC protocol is deadlock free and provides sequential consistency.