We present Continuously Adaptive Dynamic Voltage/Frequency Scaling (DVFS) in Linux systems running on Intel i7 and AMD Phenom II processors. By exploiting the slack inherent in memory-bound programs, our approach aims to improve power efficiency even when the processor does not sit idle. Our underlying methodology is based on a simple first-order processor performance model in which frequency scaling is expressed as a change (in cycles) of the main memory latency. Using the available monitoring hardware, we show that our model is powerful enough to (i) predict with reasonable accuracy the effect of frequency scaling in terms of performance loss and (ii) predict the core energy under different V/f combinations. To validate our approach, we perform highly accurate, fine-grained power measurements directly on the off-chip voltage regulators. We use our model to implement various DVFS policies as Linux "green" governors that continuously optimize for power-efficiency metrics such as EDP or ED^2P, or that achieve energy savings within a user-specified limit on performance loss. Our evaluation shows that, for SPEC2006 workloads, our governors dynamically achieve the same optimal EDP or ED^2P (within 2% on average) as an exhaustive search of all possible frequencies. Energy savings can reach up to 56% in memory-bound workloads, with corresponding improvements of about 55% for EDP or ED^2P.
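The abstract does not give the model's equations, so the following is only a minimal sketch of the idea it describes, under stated assumptions. It assumes main memory latency is fixed in seconds, so its cost in cycles grows with core frequency, and it uses the textbook CMOS approximation P = C·V²·f + P_static for core power. The `Sample` type, the operating points, and the constants `C_EFF` and `P_STATIC` are hypothetical and are not taken from the paper or from any Linux governor API.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical per-interval measurement derived from performance counters."""
    core_cycles: float  # cycles of work that scale with core frequency
    mem_stall_s: float  # seconds stalled on main memory (assumed frequency-independent)


# Hypothetical (frequency in Hz, voltage in V) operating points, not from the source.
OPERATING_POINTS = [(1.6e9, 0.9), (2.0e9, 1.0), (2.4e9, 1.1), (2.8e9, 1.2)]
C_EFF = 1e-9     # assumed effective switched capacitance (F)
P_STATIC = 1.0   # assumed static power (W)


def predicted_time(s: Sample, f_hz: float) -> float:
    """First-order model: core work shrinks with frequency, memory time does not."""
    return s.core_cycles / f_hz + s.mem_stall_s


def predicted_energy(s: Sample, f_hz: float, volts: float) -> float:
    """Energy = (dynamic + static power) * predicted time; an assumed power model."""
    power = C_EFF * volts ** 2 * f_hz + P_STATIC
    return power * predicted_time(s, f_hz)


def best_point(s: Sample, delay_exponent: int = 1):
    """Pick the operating point minimising E * T**n (n=1: EDP, n=2: ED^2P)."""
    return min(
        OPERATING_POINTS,
        key=lambda p: predicted_energy(s, *p) * predicted_time(s, p[0]) ** delay_exponent,
    )


def best_point_with_limit(s: Sample, max_loss: float):
    """Pick the lowest-energy point whose slowdown vs. the top frequency is <= max_loss."""
    f_max = max(f for f, _ in OPERATING_POINTS)
    budget = (1.0 + max_loss) * predicted_time(s, f_max)
    allowed = [p for p in OPERATING_POINTS if predicted_time(s, p[0]) <= budget]
    return min(allowed, key=lambda p: predicted_energy(s, *p))


if __name__ == "__main__":
    memory_bound = Sample(core_cycles=0.56e9, mem_stall_s=0.8)
    compute_bound = Sample(core_cycles=2.8e9, mem_stall_s=0.0)
    for name, s in [("memory-bound", memory_bound), ("compute-bound", compute_bound)]:
        print(name, "EDP choice:", best_point(s, 1), "10% limit:", best_point_with_limit(s, 0.10))
```

Under these assumed numbers, the memory-bound sample loses little time at lower frequencies, so a lower operating point minimises EDP, whereas the compute-bound sample is best served by the highest frequency. A real governor would re-evaluate this choice every sampling interval from live counter readings.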