Cache coherence protocols: evaluation using a multiprocessor simulation model
ACM Transactions on Computer Systems (TOCS)
A class of compatible cache consistency protocols and their support by the IEEE futurebus
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Line (block) size choice for CPU cache memories
IEEE Transactions on Computers
Firefly: a multiprocessor workstation
ASPLOS II Proceedings of the second international conference on Architectual support for programming languages and operating systems
The Balance Multiprocessor System
IEEE Micro
Measuring VAX 8800 performance with a histogram hardware monitor
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
An evaluation of directory schemes for cache coherence
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
A cache coherence approach for large multiprocessor systems
ICS '88 Proceedings of the 2nd international conference on Supercomputing
Memory-reference characteristics of multiprocessor applications under MACH
SIGMETRICS '88 Proceedings of the 1988 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Analysis of cache invalidation patterns in multiprocessors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
The effect of sharing on the cache and bus performance of parallel programs
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Evaluating the performance of four snooping cache coherency protocols
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
LimitLESS directories: A scalable cache coherence scheme
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Cache coherence in large-scale shared-memory multiprocessors: issues and comparisons
ACM Computing Surveys (CSUR)
Analysis of cache performance in vector processors and multiprocessors
Analysis of cache performance in vector processors and multiprocessors
The directory-based cache coherence protocol for the DASH multiprocessor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Implementing a cache consistency protocol
ISCA '85 Proceedings of the 12th annual international symposium on Computer architecture
ACM Computing Surveys (CSUR)
Analysis of Multiprocessor Memory Refernce Behavior
ICCS '94 Proceedings of the1994 IEEE International Conference on Computer Design: VLSI in Computer & Processors
Dynamic decentralized cache schemes for mimd parallel processors
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
A low-overhead coherence solution for multiprocessors with private cache memories
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Absolute and Comparative Performance of Cache Consistency Algorithms
Absolute and Comparative Performance of Cache Consistency Algorithms
Efficient Analysis of Caching Systems
Efficient Analysis of Caching Systems
SFCS '86 Proceedings of the 27th Annual Symposium on Foundations of Computer Science
Cache system design in the tightly coupled multiprocessor system
AFIPS '76 Proceedings of the June 7-10, 1976, national computer conference and exposition
Evaluation techniques for storage hierarchies
IBM Systems Journal
CPU Cache Prefetching: Timing Evaluation of Hardware Implementations
IEEE Transactions on Computers
Trace Factory: Generating Workloads for Trace-Driven Simulation of Shared-Bus Multiprocessors
IEEE Parallel & Distributed Technology: Systems & Technology
Minerva: An Adaptive Subblock Coherence Protocol for Improved SMP Performance
ISHPC '02 Proceedings of the 4th International Symposium on High Performance Computing
Analysis of Shared Memory Misses and Reference Patterns
ICCD '00 Proceedings of the 2000 IEEE International Conference on Computer Design: VLSI in Computers & Processors
Hi-index | 0.00 |
As more computer systems turn to multiprocessing for improved performance, additional research is needed to evaluate and improve the performance of cache consistency protocols. In this study, we use trace-driven simulation to examine the performance of several consistency protocols, including some new adaptive protocols which have not been examined in prior research. This study uses a wider variety of traces than have been previously analyzed, including some production applications from a vector mini-supercomputer system, and presents a wider variety of analyses than have been previously presented for a given workload. We find that the sharing characteristics of application programs have a large bearing on the relative performance of the different protocols. Update-based protocols outperform invalidate-based protocols when accesses so shared data are highly interleaved among different processors (fine-grain sharing), while invalidate-based protocols are superior if one processor performs all accesses to shared data over long periods of time (coarse-grain sharing). Adaptive protocols provide the best overall performance across all applications; we present a new protocol called Update-Once, which yields the highest average performance. In even the best cases, however, estimated processor utilizations are unacceptably low due to the overhead to maintain consistent caches. To extract good performance from multiprocessor systems, existing application programs must be recoded to reduce sharing between processors.