High memory access latency is critical to the performance of shared-memory multiprocessors. Technology trends indicate that the gap between processor and memory speeds is likely to widen in the future. To cope with the memory latency problem, two software-controlled techniques have been investigated: prefetching and remote write. Prefetching is a consumer-initiated technique that moves data close to the processor before it is actually needed, through explicit execution of a prefetch instruction. Remote write is a producer-initiated technique that moves data close to the processor estimated to be the next consumer. However, both techniques can degrade performance when future needs and/or consumers are mispredicted. This paper presents a new method, called lazy prefetching, that combines the good properties of prefetching and remote write. The experimental methodology used for performance analysis is also described.
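
As an illustration of consumer-initiated software prefetching in general (not of the lazy prefetching method itself, whose mechanism the abstract does not specify), the following minimal C sketch issues an explicit prefetch a fixed distance ahead of the element being consumed. It assumes a GCC- or Clang-compatible compiler that provides `__builtin_prefetch`; the function name `sum_with_prefetch` and the constant `PREFETCH_DISTANCE` are hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements. A real value would be tuned
 * to the memory latency and the amount of work done per iteration. */
#define PREFETCH_DISTANCE 16

/* Sum an array while explicitly prefetching data ahead of its use.
 * The consumer (this loop) initiates the data movement itself. */
long sum_with_prefetch(const long *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = prefetch for read,
             * 3 = high expected temporal locality. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += data[i];
    }
    return total;
}
```

The prefetch is only a hint: it does not change the computed result, and a poorly chosen distance can waste bandwidth or evict useful data, which is the kind of misprediction cost mentioned above.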