Memory access buffering in multiprocessors
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
A cache coherence approach for large multiprocessor systems
ICS '88 Proceedings of the 2nd international conference on Supercomputing
Memory-reference characteristics of multiprocessor applications under MACH
SIGMETRICS '88 Proceedings of the 1988 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Analysis of cache invalidation patterns in multiprocessors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
Evaluating the performance of four snooping cache coherency protocols
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
Munin: distributed shared memory based on type-specific memory coherence
PPOPP '90 Proceedings of the second ACM SIGPLAN symposium on Principles & practice of parallel programming
LimitLESS directories: A scalable cache coherence scheme
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Combining hardware and software cache coherence strategies
ICS '91 Proceedings of the 5th international conference on Supercomputing
Comparison of hardware and software cache coherence schemes
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
The Stanford Dash Multiprocessor
Computer
Design and evaluation of a compiler algorithm for prefetching
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Willow: a scalable shared memory multiprocessor
Proceedings of the 1992 ACM/IEEE conference on Supercomputing
Cache coherence in large-scale shared-memory multiprocessors: issues and comparisons
ACM Computing Surveys (CSUR)
Limitations of cache prefetching on a bus-based multiprocessor
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
An economical solution to the cache coherence problem
ISCA '84 Proceedings of the 11th annual international symposium on Computer architecture
Analytical Prediction of Performance for Cache Coherence Protocols
IEEE Transactions on Computers
IEEE Transactions on Parallel and Distributed Systems
A Compiler-Assisted Scheme for Adaptive Cache Coherence Enforcement
PACT '94 Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques
Proceedings of the 30th annual international symposium on Computer architecture
Coherence decoupling: making use of incoherence
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
Cache coherence schemes that dynamically adapt to memory referencing patterns have been proposed to improve coherence enforcement in shared-memory multiprocessors. By using only run-time information, however, these existing schemes are incapable of looking ahead in the memory referencing stream. We present a combined hardware-software strategy that uses the predictive capability of the compiler to select updating or invalidating for each write reference. To determine the potential performance improvement that can be achieved with this optimization, three different levels of compiler capabilities are examined. Simulations using memory traces show that with an ideal compiler, this optimization can potentially reduce the miss ratio by 0.4% to 15% compared to an invalidating-only scheme, while reducing the generated network traffic by 13% to 94 % compared to an updating-only scheme. In addition, this optimization can potentially reduce the miss ratio by up to 13%, while reducing the generated network traffic by up to 92%, compared to a dynamic adaptive scheme. Furthermore, performance can be potentially improved even with a compiler capable of performing only imprecise array subscript analysis and no interprocedural analysis.Index Terms驴Compiler optimization, update, invalidate, directory, cache coherence, shared-memory, multiprocessor.