Decoupled compressed cache: exploiting spatial locality for energy-optimized compressed caching

Authors:
Somayeh Sardashti;David A. Wood
Affiliations:
University of Wisconsin-Madison;University of Wisconsin-Madison
Venue:
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Year:
2013

Citing 26
Cited 0

Decoupled sectored caches: conciliating low tag implementation cost

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
The pool of subsectors cache design

ICS '99 Proceedings of the 13th international conference on Supercomputing
A fully associative software-managed cache design

Proceedings of the 27th annual international symposium on Computer architecture
Dynamic zero compression for cache energy reduction

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Frequent value compression in data caches

Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
An on-chip cache compression technique to reduce decompression overhead and design complexity

Journal of Systems Architecture: the EUROMICRO Journal
Frequent value locality and its applications

ACM Transactions on Embedded Computing Systems (TECS)
Simulating a $2M Commercial Server on a $2K PC

Computer
SPEComp: A New Benchmark Suite for Measuring Parallel Computer Performance

WOMPAT '01 Proceedings of the International Workshop on OpenMP Applications and Tools: OpenMP Shared Memory Parallel Programming
Energy efficient frequent value data cache design

Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Variability in Architectural Simulations of Multi-Threaded Workloads

HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Performance of Hardware Compressed Main Memory

HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Adaptive Cache Compression for High-Performance Processors

Proceedings of the 31st annual international symposium on Computer architecture
A Unified Compressed Memory Hierarchy

HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
A Robust Main-Memory Compression Scheme

Proceedings of the 32nd annual international symposium on Computer Architecture
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset

ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
A Framework for Coarse-Grain Optimizations in the On-Chip Memory Hierarchy

Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Zero-content augmented caches

Proceedings of the 23rd international conference on Supercomputing
IBM memory expansion technology (MXT)

IBM Journal of Research and Development
Structural aspects of the system/360 model 85: II the cache

IBM Systems Journal
Decoupled zero-compressed memory

Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
C-pack: a high-performance microprocessor cache compression algorithm

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Adaptive granularity memory systems: a tradeoff between storage efficiency and throughput

Proceedings of the 38th annual international symposium on Computer architecture
Residue cache: a low-energy low-area L2 cache architecture via compression and partial hits

Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Base-delta-immediate compression: practical data compression for on-chip caches

Proceedings of the 21st international conference on Parallel architectures and compilation techniques
ECM: Effective Capacity Maximizer for high-performance compressed caching

HPCA '13 Proceedings of the 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA)

Quantified Score

Hi-index	0.00

Visualization

Abstract

In multicore processor systems, last-level caches (LLCs) play a crucial role in reducing system energy by i) filtering out expensive accesses to main memory and ii) reducing the time spent executing in high-power states. Cache compression can increase effective cache capacity and reduce misses, improve performance, and potentially reduce system energy. However, previous compressed cache designs have demonstrated only limited benefits due to internal fragmentation and limited tags. In this paper, we propose the Decoupled Compressed Cache (DCC), which exploits spatial locality to improve both the performance and energy-efficiency of cache compression. DCC uses decoupled super-blocks and non-contiguous sub-block allocation to decrease tag overhead without increasing internal fragmentation. Non-contiguous sub-blocks also eliminate the need for energy-expensive re-compaction when a block's size changes. Compared to earlier compressed caches, DCC increases normalized effective capacity to a maximum of 4 and an average of 2.2 for a wide range of workloads. A further optimized Co-DCC (Co-Compacted DCC) design improves the average normalized effective capacity to 2.6 by co-compacting the compressed blocks in a super-block. Our simulations show that DCC nearly doubles the benefits of previous compressed caches with similar area overhead. We also demonstrate a practical DCC design based on a recent commercial LLC design.