This survey examines cache coherence schemes of varying hardware complexity, ranging from protocols that maintain coherence in hardware to software policies that prevent copies of shared, writable data from existing. Some examples of the use of shared data are examined, and these examples point out a number of performance issues. Hardware protocols are considered. They can maintain consistency efficiently, although in some cases at the cost of considerable hardware complexity, especially in multiprocessors with many processors. Software schemes are investigated as an alternative capable of reducing the hardware cost.
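The contrast between the two ends of that range can be sketched in a few lines of Python. The toy model below is an illustration only, not a protocol taken from the survey: the class names `InvalidateSystem` and `UncachedSharedSystem`, the write-through memory, and the dictionary-based caches are all assumptions made for brevity. The first class mimics a hardware write-invalidate scheme, in which a write removes every other cached copy. The second mimics a software policy in which addresses declared shared and writable are never cached at all.

```python
class InvalidateSystem:
    """Toy hardware-style scheme: write-through with write-invalidate.

    Hypothetical model; real protocols track per-line states and ownership.
    """

    def __init__(self, num_cpus):
        self.memory = {}                                # address -> value
        self.caches = [dict() for _ in range(num_cpus)]  # one cache per CPU

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:
            # Miss: fetch the current value from memory (default 0).
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, cpu, addr, value):
        # Invalidate every other cached copy before the write takes effect.
        for other, cache in enumerate(self.caches):
            if other != cpu:
                cache.pop(addr, None)
        self.caches[cpu][addr] = value
        self.memory[addr] = value                        # write-through


class UncachedSharedSystem:
    """Toy software-style policy: shared, writable data is never cached."""

    def __init__(self, num_cpus, shared_writable):
        self.memory = {}
        self.caches = [dict() for _ in range(num_cpus)]
        # Assumed to be supplied by the programmer or compiler.
        self.uncacheable = set(shared_writable)

    def read(self, cpu, addr):
        if addr in self.uncacheable:
            # No cached copy can exist, so memory is always up to date.
            return self.memory.get(addr, 0)
        cache = self.caches[cpu]
        if addr not in cache:
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, cpu, addr, value):
        self.memory[addr] = value
        if addr not in self.uncacheable:
            # Private data may be cached freely by the writing CPU.
            self.caches[cpu][addr] = value
```

In the first model every write pays for an invalidation of the other caches, which stands in for the hardware cost the abstract mentions. In the second model no such mechanism is needed, but every access to shared, writable data goes to memory.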