Morphable Cache Architectures: Potential Benefits

Authors:
I. Kadayif;M. Kandemir;N. Vijaykrishnan;M. J. Irwin;J. Ramanujam
Affiliations:
Pennsylvania State University, University Park, PA;Pennsylvania State University, University Park, PA;Pennsylvania State University, University Park, PA;Pennsylvania State University, University Park, PA;Pennsylvania State University, University Park, PA
Venue:
OM '01 Proceedings of the 2001 ACM SIGPLAN workshop on Optimization of middleware and distributed systems
Year:
2001

Citing 22
Cited 2

Strategies for cache and local memory management by global program transformation

Journal of Parallel and Distributed Computing - Special Issue on Languages, Compilers and environments for Parallel Programming
Shade: a fast instruction-set simulator for execution profiling

SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Cache interference phenomena

SIGMETRICS '94 Proceedings of the 1994 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Internal organization of the Alpha 21164, a 300-MHz 64-bit quad-issue CMOS RISC microprocessor

Digital Technical Journal - Special 10th anniversary issue
Cache design trade-offs for power and performance optimization: a case study

ISLPED '95 Proceedings of the 1995 international symposium on Low power design
A quantitative analysis of loop nest locality

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Adapting cache line size to application behavior

ICS '99 Proceedings of the 13th international conference on Supercomputing
Reducing power in superscalar processor caches using subbanking, multiple line buffers and bit-line segmentation

ISLPED '99 Proceedings of the 1999 international symposium on Low power electronics and design
Automatic and efficient evaluation of memory hierarchies for embedded systems

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Selective cache ways: on-demand cache resource allocation

Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Run-Time Cache Bypassing

IEEE Transactions on Computers
Energy-driven integrated hardware-software optimizations using SimplePower

Proceedings of the 27th annual international symposium on Computer architecture
Reconfigurable caches and their application to media processing

Proceedings of the 27th annual international symposium on Computer architecture
Gated-Vdd: a circuit technique to reduce leakage in deep-submicron cache memories

ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Memory system energy (poster session): influence of hardware-software optimizations

ISLPED '00 Proceedings of the 2000 international symposium on Low power electronics and design
Memory hierarchy reconfiguration for energy and performance in general-purpose processor architectures

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Custom Memory Management Methodology: Exploration of Memory Organisation for Embedded Multimedia System Design

Custom Memory Management Methodology: Exploration of Memory Organisation for Embedded Multimedia System Design
Design of High-Performance Microprocessor Circuits

Design of High-Performance Microprocessor Circuits
High Performance Compilers for Parallel Computing

High Performance Compilers for Parallel Computing
Architectural adaptation for application-specific locality optimizations

ICCD '97 Proceedings of the 1997 International Conference on Computer Design (ICCD '97)
Hardware techniques to improve the performance of the processor/memory interface

Hardware techniques to improve the performance of the processor/memory interface

Compiler-directed cache polymorphism

Proceedings of the joint conference on Languages, compilers and tools for embedded systems: software and compilers for embedded systems
Analyzing data reuse for cache reconfiguration

ACM Transactions on Embedded Computing Systems (TECS)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Computer architects have tried to mitigate the consequences of high memory latencies using a variety techniques. An example of these techniques is multi-level caches to counteract the latency that results from having a memory that is slower than the processor. Recent research has demonstrated that compiler optimizations that modify data layouts and restructure computation can be successful in improving memory system performance. However, in many cases, working with a fixed cache configuration prevents the application/compiler from obtaining the maximum performance. In addition, prompted by demands in portability, long battery life, and low-cost packaging, the computer industry has started viewing energy and power as decisive design factors, along with performance and cost. This makes the job of the compiler/user even more difficult as one needs to strike a balance between low power/energy consumption and high performance. Consequently, adapting the code to the underlying cache/memory hierarchy is becoming more and more difficult.In this paper, we take an alternate approach and attempt to adapt the cache architecture to the software needs. We focus on array-dominated applications and measure the potential benefits that could be gained from a morphable (reconfigurable) cache architecture. Our results show that not only different applications work best with different cache configurations, but also that different loop nests in a given application demand different configurations. Our results also indicate that the most suitable cache configuration for a given application or a single nest depends strongly on the objective function being optimized. For example, minimizing cache memory energy requires a different cache configuration for each nest than an objective which tries to minimize the overall memory system energy. Based on our experiments, we conclude that fine-grain (loop nest-level) cache configuration management is an important step for a solution to the challenging architecture/software tradeoffs awaiting system designers in the future.