Recent work has demonstrated that cache space is often poorly utilized. However, no previous work has demonstrated upper bounds on what a cache or local memory could achieve when exploiting both spatial and temporal locality. Belady's MIN algorithm does yield an upper bound, but it exploits only temporal locality. In this article, we present an optimal replacement algorithm for local memory that exploits temporal and spatial locality simultaneously. This algorithm is an extension of Belady's algorithm. We prove that the new algorithm is optimal with respect to minimizing misses, and we show experimentally that it produces nearly minimum memory traffic on the SPEC95 benchmarks. Like Belady's algorithm, our algorithm requires the entire program trace. It uses future accesses to select replacement victims and to decide how many words to fetch at once. Many different spatial locality strategies can be implemented with this algorithm. With an optimal strategy, the algorithm yields an upper bound against which alternative implementations to today's caches can be evaluated. We further demonstrate the utility of this algorithm as an analysis tool by evaluating, on the SPEC95 benchmarks, several strategies intermediate between a conventional cache and the optimal strategy, which highlights the limitations of the cache line paradigm.
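For orientation, the baseline that the abstract builds on can be sketched in a few lines. The following is a minimal illustration of the classic Belady MIN policy only (temporal locality, one block fetched per miss), not the extended algorithm of the article, whose details are not given here. The function name `belady_min_misses`, the fully associative model, and the representation of the trace as a list of block addresses are assumptions made for this example.

```python
from collections import defaultdict, deque


def belady_min_misses(trace, capacity):
    """Count misses of a fully associative memory holding `capacity` blocks
    under Belady's MIN policy (hypothetical helper, for illustration only)."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")

    # The whole trace is needed up front: record every position at which
    # each block address is accessed.
    positions = defaultdict(deque)
    for i, addr in enumerate(trace):
        positions[addr].append(i)

    resident = set()
    misses = 0
    for addr in trace:
        # Drop the current access; the head of the queue is now the next use.
        positions[addr].popleft()
        if addr in resident:
            continue  # hit
        misses += 1
        if len(resident) >= capacity:
            # Evict the resident block whose next use lies farthest in the
            # future; blocks that are never used again count as infinity.
            def next_use(a):
                return positions[a][0] if positions[a] else float("inf")

            victim = max(resident, key=next_use)
            resident.remove(victim)
        resident.add(addr)
    return misses
```

Calling `belady_min_misses([1, 2, 3, 1, 2, 4, 1, 2], 3)` counts four misses: the three cold misses plus the miss on block 4, which evicts block 3 because it is never referenced again. The algorithm described in the abstract additionally decides how many words to fetch on each miss, which this sketch does not model.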