Write-invalidate and write-broadcast coherency protocols have been criticized for being unable to achieve good bus performance across all cache configurations. In particular, write-invalidate performance can suffer as block size increases, and large cache sizes hurt write-broadcast. Read-broadcast and competitive snooping extensions to these protocols have been proposed to address the two problems, respectively.

Our results indicate that the benefits of the extensions are limited. Read-broadcast reduces the number of invalidation misses, but at a high cost in processor lockout from the cache; the net effect can be an increase in total execution cycles. Competitive snooping benefits only those programs with high per-processor locality of reference to shared data. For programs characterized by inter-processor contention for shared addresses, competitive snooping can degrade performance by causing a slight increase in bus utilization and total execution time.
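
To make the competitive snooping idea concrete, the following is a minimal sketch of one common formulation, not the simulator or the exact protocol variant evaluated here. It assumes a write-broadcast cache that counts the updates it receives for a block since its last local reference and drops the block once that count reaches a cutoff. The names `SnoopedBlock`, `replay`, and `threshold`, and the single-block trace format (`"L"` for a local reference, `"R"` for a remote write), are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SnoopedBlock:
    """One shared block as seen by a single snooping cache (illustrative model)."""

    threshold: int             # assumed cutoff: updates tolerated with no local reference
    valid: bool = True         # whether this cache currently holds a copy
    unused_updates: int = 0    # updates received since the last local reference
    updates_absorbed: int = 0  # total broadcast updates this cache had to process
    misses: int = 0            # local references that found the block invalidated

    def local_access(self) -> None:
        """The local processor references the block."""
        if not self.valid:
            self.misses += 1   # the block was dropped earlier and must be refetched
            self.valid = True
        self.unused_updates = 0

    def remote_write(self) -> None:
        """Another processor writes the block and broadcasts the update."""
        if not self.valid:
            return             # no local copy, so the update is ignored
        self.updates_absorbed += 1
        self.unused_updates += 1
        if self.unused_updates >= self.threshold:
            self.valid = False  # stop paying for updates nobody reads locally
            self.unused_updates = 0


def replay(trace: str, threshold: int) -> SnoopedBlock:
    """Replay a trace of 'L' (local reference) and 'R' (remote write) events."""
    block = SnoopedBlock(threshold=threshold)
    for event in trace:
        if event == "L":
            block.local_access()
        elif event == "R":
            block.remote_write()
        else:
            raise ValueError(f"unknown event: {event!r}")
    return block
```

In this toy model, a trace such as `"RRRRRRL"` with `threshold=3` absorbs three updates and then takes one miss, whereas a very large threshold (plain write-broadcast) absorbs all six updates and takes none. A trace that alternates short bursts of remote writes with local references, such as `"RRLRRL"` with `threshold=2`, is invalidated and refetched repeatedly, which is one way to picture the contention case described above.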