Analysis of interprocedural side effects in a parallel programming environment
Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Efficiently computing static single assignment form and the control dependence graph
ACM Transactions on Programming Languages and Systems (TOPLAS)
Comparison of hardware and software cache coherence schemes
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Detecting redundant accesses to array data
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Automatic software cache coherence through vectorization
ICS '92 Proceedings of the 6th international conference on Supercomputing
Life span strategy—a compiler-based approach to cache coherence
ICS '92 Proceedings of the 6th international conference on Supercomputing
Execution-driven tools for parallel simulation of parallel architectures and applications
Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Improving locality and parallelism in nested loops
Improving locality and parallelism in nested loops
Automatic array privatization and demand-driven symbolic analysis
Automatic array privatization and demand-driven symbolic analysis
The directory-based cache coherence protocol for the DASH multiprocessor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
A compiler-directed cache coherence scheme with improved intertask locality
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Design and Analysis of a Scalable Cache Coherence Scheme Based on Clocks and Timestamps
IEEE Transactions on Parallel and Distributed Systems
An Exact Method for Analysis of Value-based Array Data Dependences
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
Hardware and compiler support for cache coherence in large-scale shared-memory multiprocessors
Hardware and compiler support for cache coherence in large-scale shared-memory multiprocessors
IEEE Transactions on Parallel and Distributed Systems
IEEE Transactions on Parallel and Distributed Systems
Hi-index | 0.00 |
In this paper, we develop a compiler algorithm for detecting references to stale data in shared-memory multiprocessors. The algorithm consists of two key analysis techniques, stale reference detection and locality preserving analysis. While the stale reference detection finds the memory reference patterns that may violate cache coherence, the locality preserving analysis minimizes the number of such stale references by analyzing both temporal and spatial reuses. By computing the regions referenced by arrays inside loops, we extend the previous scalar algorithms for more precise analysis. We have implemented the algorithm on the Polaris parallelizing compiler, and using execution-driven simulations on Perfect Club benchmarks we demonstrate how unnecessary cache misses can be eliminated by the automatic stale reference detection.