Decoupled sectored caches: conciliating low tag implementation cost
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
A fully associative software-managed cache design
Proceedings of the 27th annual international symposium on Computer architecture
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Dynamic zero compression for cache energy reduction
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Frequent value compression in data caches
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
An on-chip cache compression technique to reduce decompression overhead and design complexity
Journal of Systems Architecture: the EUROMICRO Journal
Symbiotic jobscheduling with priorities for a simultaneous multithreading processor
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Frequent value locality and its applications
ACM Transactions on Embedded Computing Systems (TECS)
Improving System Performance with Compressed Memory
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
Performance of Hardware Compressed Main Memory
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Adaptive Compressed Caching: Design and Implementation
SBAC-PAD '03 Proceedings of the 15th Symposium on Computer Architecture and High Performance Computing
Adaptive Cache Compression for High-Performance Processors
Proceedings of the 31st annual international symposium on Computer architecture
A Unified Compressed Memory Hierarchy
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
A Robust Main-Memory Compression Scheme
Proceedings of the 32nd annual international symposium on Computer Architecture
RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence
Proceedings of the 32nd annual international symposium on Computer Architecture
Improving Multiprocessor Performance with Coarse-Grain Coherence Tracking
Proceedings of the 32nd annual international symposium on Computer Architecture
Analysis of the O-GEometric History Length Branch Predictor
Proceedings of the 32nd annual international symposium on Computer Architecture
Communist, utilitarian, and capitalist cache policies on CMPs: caches as a shared resource
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
The case for compressed caching in virtual memory systems
ATEC '99 Proceedings of the annual conference on USENIX Annual Technical Conference
A Framework for Coarse-Grain Optimizations in the On-Chip Memory Hierarchy
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Decoupled zero-compressed memory
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
A unified approach to eliminate memory accesses early
CASES '11 Proceedings of the 14th international conference on Compilers, architectures and synthesis for embedded systems
HICAMP: architectural support for efficient concurrency-safe shared structured data access
ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
Residue cache: a low-energy low-area L2 cache architecture via compression and partial hits
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Base-delta-immediate compression: practical data compression for on-chip caches
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Linearly compressed pages: a main memory compression framework with low complexity and low latency
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Data filter cache with word selection cache for low power embedded processor
Proceedings of the 2013 Research in Adaptive and Convergent Systems
Decoupled compressed cache: exploiting spatial locality for energy-optimized compressed caching
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Linearly compressed pages: a low-complexity, low-latency main memory compression framework
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Hi-index | 0.00 |
It has been observed that some applications manipulate large amounts of null data. Moreover these zero data often exhibit high spatial locality. On some applications more than 20% of the data accesses concern null data blocks. Representing a null block in a cache on a standard cache line appears as a waste of resources. In this paper, we propose the Zero-Content Augmented cache, the ZCA cache. A ZCA cache consists of a conventional cache augmented with a specialized cache for memorizing null blocks, the Zero-Content cache or ZC cache. In the ZC cache, the data block is represented by its address tag and a validity bit. Moreover, as null blocks generally exhibit high spatial locality, several null blocks can be associated with a single address tag in the ZC cache. For instance, a ZC cache mapping 32MB of zero 64-byte lines uses less than 80KB of storage. Decompression of a null block is very simple, therefore read access time on the ZCA cache is in the same range as the one of a conventional cache. On applications manipulating large amount of null data blocks, such a ZC cache allows to significantly reduce the miss rate and memory traffic, and therefore to increase performance for a small hardware overhead. In particular, the write-back traffic on null blocks is limited. For applications with a low null block rate, no performance loss is observed.