The problem of cache coherence in shared-memory multiprocessors has been addressed using two basic approaches: directory schemes and snoopy cache schemes. Directory schemes have received less attention in the past several years, while snoopy cache methods have become extremely popular. Directory schemes are nonetheless potentially attractive in large multiprocessor systems that are beyond the scaling limits of snoopy cache schemes, and slight modifications can make them competitive in performance with snoopy cache schemes for small multiprocessors. Trace-driven simulation, using data collected from several real multiprocessor applications, is used to compare the performance of standard directory schemes, modifications to these schemes, and snoopy cache protocols.
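To make the idea of trace-driven comparison concrete, the following is a minimal sketch and not the simulator used in the work described above. It assumes a hypothetical trace format of (cpu, op, block) tuples, a write-invalidate policy, and infinite caches, and it ignores dirty state, write-backs, and timing. The function name `simulate` and the two counters are illustrative only. It counts the invalidation messages sent when a full-map directory notifies only the caches that actually hold a copy, and compares that with a scheme that keeps no sharer list and must notify every other cache.

```python
from collections import defaultdict


def simulate(trace, num_cpus):
    """Replay (cpu, op, block) references under a write-invalidate policy.

    Returns (directed, broadcast):
      directed  - invalidations sent when a full-map directory knows the
                  exact sharers and messages only those caches
      broadcast - invalidations sent when no sharer list is kept, so each
                  invalidating write goes to all other caches
    Toy model (assumption): infinite caches, no dirty state, no timing.
    """
    sharers = defaultdict(set)  # block -> set of CPUs currently holding a copy
    directed = 0
    broadcast = 0
    for cpu, op, block in trace:
        if op == "r":
            # A read adds the reader to the block's sharer set.
            sharers[block].add(cpu)
        elif op == "w":
            # A write must invalidate every copy held by another CPU.
            others = sharers[block] - {cpu}
            if others:
                directed += len(others)
                broadcast += num_cpus - 1
            # The writer ends up as the only holder.
            sharers[block] = {cpu}
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return directed, broadcast


if __name__ == "__main__":
    # Hypothetical four-CPU trace: block 0xA is read-shared, then written.
    trace = [
        (0, "r", 0xA),
        (1, "r", 0xA),
        (2, "w", 0xA),
        (2, "w", 0xA),
        (3, "r", 0xB),
        (3, "w", 0xB),
    ]
    print(simulate(trace, num_cpus=4))
```

In this toy model the gap between the two counters grows with the number of processors whenever blocks are shared by only a few caches, which is the intuition behind the scaling argument for directories. A real evaluation would also need to model bus or network timing and finite caches.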