An open question in chip multiprocessors (CMPs) is how to organize large on-chip cache resources. The answer must consider hit and miss latencies, energy consumption, and power dissipation. To handle this diversity of metrics, we propose the Amorphous Cache, an adaptive heterogeneous architecture for large cache memories that provides new forms of configurability. The Amorphous Cache adapts to the code and data of the running task by partially shutting down its arrays at run time, so that both the cache size and the set associativity can be changed. Four reconfiguration modes are available, which prioritize either IPC, processor power dissipation, energy consumption of the processor and DIMM memory module, or the processor power²×delay product. They have been evaluated in CMPs that use private L2 caches and execute independent tasks. Taking one core of a CMP with a 4-MB shared L2 cache as the baseline, a single core with a 2-MB private L2 Amorphous Cache achieves maximum average improvements in IPC, power dissipation, energy consumption, and power²×delay of 14.2%, 44.3%, 18.1%, and 29.4%, respectively.
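As a rough illustration of what choosing a configuration under one of the four modes could look like, the sketch below picks the best entry from a list of candidate cache configurations. It is not the paper's mechanism: the names `CacheConfig`, `MODES`, and `select_config` are hypothetical, the candidate numbers are invented purely for the example, and delay is assumed to be proportional to 1/IPC when computing the power-squared times delay product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    """Hypothetical record of one cache configuration and its estimated metrics."""
    size_kb: int      # active cache capacity after partial array shutdown
    ways: int         # set associativity
    ipc: float        # estimated instructions per cycle
    power_w: float    # estimated processor power dissipation
    energy_j: float   # estimated processor + DIMM energy for the task


def power2_delay(cfg: CacheConfig) -> float:
    # Assumption: delay is proportional to 1/IPC (cycles per instruction).
    return cfg.power_w ** 2 / cfg.ipc


# Each mode maps a configuration to a cost; lower cost is better.
MODES = {
    "ipc": lambda c: -c.ipc,          # maximize IPC
    "power": lambda c: c.power_w,     # minimize power dissipation
    "energy": lambda c: c.energy_j,   # minimize energy consumption
    "power2_delay": power2_delay,     # minimize power^2 x delay
}


def select_config(candidates, mode):
    """Return the candidate with the lowest cost under the given mode."""
    return min(candidates, key=MODES[mode])


# Invented example values, for illustration only.
CANDIDATES = [
    CacheConfig(size_kb=2048, ways=8, ipc=1.00, power_w=20.0, energy_j=12.0),
    CacheConfig(size_kb=1024, ways=4, ipc=0.95, power_w=15.0, energy_j=10.0),
    CacheConfig(size_kb=512, ways=2, ipc=0.50, power_w=12.0, energy_j=11.0),
]

if __name__ == "__main__":
    for mode in MODES:
        best = select_config(CANDIDATES, mode)
        print(f"{mode}: {best.size_kb} KB, {best.ways}-way")
```

With these invented numbers, the IPC mode keeps the full 2-MB configuration, the power mode shrinks to the smallest one, and the energy and power-squared times delay modes settle on the intermediate size, which shows how the four priorities can disagree about the same workload.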