Multis, shared-memory multiprocessors that are implemented with single buses and snooping cache protocols, are inherently limited to a small number of processors, and, as systems grow beyond a single bus, the bandwidth requirements of broadcast operations limit scalability. Hardware support to provide cache coherence without the use of broadcast can become very expensive. An approach to maintaining coherence using approximate information held in special-purpose caches called pruning-caches that provides robust performance over a wide range of workloads is presented. The pruning-cache approach is compared to the more conventional inclusion cache for providing multilevel inclusion (MLI) in the cache hierarchy. It is shown that pruning-caches are more cost-effective and more robust. Using both analysis and simulation, it is also shown that the k-ary n-cube topology provides scalable, bottleneck-free communication for uniform, point-to-point traffic.
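For readers unfamiliar with the k-ary n-cube topology named above, the following is a minimal sketch of its usual textbook definition: k^n nodes, each labeled by n base-k digits, with wraparound links between nodes that differ by one (mod k) in a single digit. It is not taken from the paper; the function names (`neighbors`, `hop_distance`, `average_distance`) are hypothetical, and the wraparound (torus) variant is an assumption.

```python
from itertools import product


def neighbors(node, k):
    """Return the neighbors of `node` in a k-ary n-cube.

    `node` is a tuple of n digits, each in range(k). Assumes the
    wraparound (torus) variant: digits are adjacent modulo k.
    """
    result = set()
    for dim, digit in enumerate(node):
        for step in (-1, 1):
            other = list(node)
            other[dim] = (digit + step) % k
            result.add(tuple(other))
    # For k == 2 the +1 and -1 steps coincide; the set removes the duplicate.
    result.discard(node)  # only relevant for the degenerate case k == 1
    return result


def hop_distance(a, b, k):
    """Minimum number of links between nodes a and b (shortest way round each ring)."""
    return sum(min((x - y) % k, (y - x) % k) for x, y in zip(a, b))


def average_distance(k, n):
    """Mean hop distance from one node to every other node.

    By symmetry the choice of source node does not matter, so the
    all-zero node is used. Under uniform point-to-point traffic this
    is the expected path length of a message.
    """
    origin = (0,) * n
    nodes = list(product(range(k), repeat=n))
    total = sum(hop_distance(origin, node, k) for node in nodes)
    return total / (len(nodes) - 1)
```

Calling `neighbors((0, 0), 4)` lists the four nodes adjacent to the origin of a 4-ary 2-cube, and `average_distance(k, n)` gives a rough sense of how path length grows with k and n under this assumed model.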