Decoupled sectored caches: conciliating low tag implementation cost
ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Bus-invert coding for low-power I/O
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Low-swing interconnect interface circuits
ISLPED '98 Proceedings of the 1998 international symposium on Low power electronics and design
Dynamic zero compression for cache energy reduction
Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture
Cache decay: exploiting generational behavior to reduce cache leakage power
ISCA '01 Proceedings of the 28th annual international symposium on Computer architecture
Drowsy caches: simple techniques for reducing leakage power
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
OpenMP: An Industry-Standard API for Shared-Memory Programming
IEEE Computational Science & Engineering
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Interconnect-power dissipation in a microprocessor
Proceedings of the 2004 international workshop on System level interconnect prediction
Leakage Power Optimization Techniques for Ultra Deep Sub-Micron Multi-Level Caches
Proceedings of the 2003 IEEE/ACM international conference on Computer-aided design
Microarchitectural Wire Management for Performance and Power in Partitioned Architectures
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Exploiting Low Entropy to Reduce Wire Delay
IEEE Computer Architecture Letters
New Generation of Predictive Technology Model for Sub-45nm Design Exploration
ISQED '06 Proceedings of the 7th International Symposium on Quality Electronic Design
SPEC CPU2006 benchmark descriptions
ACM SIGARCH Computer Architecture News
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Energy-Aware Interconnect Optimization for a Coarse Grained Reconfigurable Processor
VLSID '08 Proceedings of the 21st International Conference on VLSI Design
Memory mapped ECC: low-cost error protection for last level caches
Proceedings of the 36th annual international symposium on Computer architecture
Phoenix rebirth: Scalable MapReduce on a large-scale shared-memory system
IISWC '09 Proceedings of the 2009 IEEE International Symposium on Workload Characterization (IISWC)
Hi-index | 0.00 |
Increasing cache sizes in modern microprocessors require long wires to connect cache arrays to processor cores. As a result, the last-level cache (LLC) has become a major contributor to processor energy, necessitating techniques to increase the energy efficiency of data exchange over LLC interconnects. This paper presents an energy-efficient data exchange mechanism using synchronized counters. The key idea is to represent information by the delay between two consecutive pulses on a set of wires, which makes the number of state transitions on the interconnect independent of the data patterns, and significantly lowers the activity factor. Simulation results show that the proposed technique reduces overall processor energy by 7%, and the L2 cache energy by 1.81× on a set of sixteen parallel applications. This efficiency gain is attained at a cost of less than 1% area overhead to the L2 cache, and a 2% delay overhead to execution time.