Two techniques for improving performance on bus-based multiprocessors

Authors:
C. Anderson;J.-L. Baer
Affiliations:
-;-
Venue:
HPCA '95 Proceedings of the 1st IEEE Symposium on High-Performance Computer Architecture
Year:
1995

Citing 12
Cited 8

Cache coherence protocols: evaluation using a multiprocessor simulation model

ACM Transactions on Computer Systems (TOCS)
Coherency for multiprocessor virtual address caches

ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
Logic verification algorithms and their parallel implementation

DAC '87 Proceedings of the 24th ACM/IEEE Design Automation Conference
The Wisconsin multicube: a new large-scale cache-coherent multiprocessor

ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
Evaluating the performance of four snooping cache coherency protocols

ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
SPLASH: Stanford parallel applications for shared-memory

ACM SIGARCH Computer Architecture News
Adjustable block size coherent caches

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Cache write policies and performance

ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Decoupled sectored caches: conciliating low tag implementation cost

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The Cerberus Multiprocessor Simulator

Proceedings of the Third SIAM Conference on Parallel Processing for Scientific Computing
Dynamic decentralized cache schemes for mimd parallel processors

ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
A low-overhead coherence solution for multiprocessors with private cache memories

ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture

Boosting the performance of hybrid snooping cache protocols

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
A Performance Study on Bounteous Transfer in Multiprocessor Sectored Caches

The Journal of Supercomputing - Special issue: high performance computing systems
Achieving High Performance in Bus-Based Shared-Memory Multiprocessors

IEEE Concurrency
Minerva: An Adaptive Subblock Coherence Protocol for Improved SMP Performance

ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Cache Injection: A Novel Technique for Tolerating Memory Latency in Bus-Based SMPs

Euro-Par '00 Proceedings from the 6th International Euro-Par Conference on Parallel Processing
Two Adaptive Hybrid Cache Coherency Protocols

HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Analysis of Shared Memory Misses and Reference Patterns

ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Service based communication for MPSoC platform-SegBus

Microprocessors & Microsystems

Quantified Score

Hi-index	0.00

Visualization

Abstract

We explore two techniques for reducing memory latency in bus-based multiprocessors. The first one, designed for sector caches, is a snoopy cache coherence protocol that uses a large transfer block to take advantage of spatial locality, while using a small coherence block (called a subblock to avoid false sharing). The second technique is read snarfing (or read broadcasting), in which all caches can acquire data transmitted in response to a read request to update invalid blocks in their own cache. We evaluated the two techniques by simulating 6 applications that exhibit a variety of reference patterns. We compared the performance of the new protocol against that of the Illinois protocol with both small and large block sizes and found that it was effective in reducing memory latency and providing more consistent, good results than the Illinois protocol with a given line size. Read snarfing also improved performance mostly for protocols that use large line sizes.