Process variability from a range of sources is growing as technology scales below 65 nm, producing increasingly nonuniform transistor delay and leakage power both within a die and across dies. As a result, the negative impact of process variations on a processor's maximum operating frequency and total power consumption is expected to worsen. Meanwhile, manufacturers have integrated more cores on a single die, substantially improving the throughput of processors running highly parallel applications. However, many existing applications lack enough parallelism to exploit all the cores on a die. In this paper, we first analyze the throughput impact of applying per-core power gating and dynamic voltage and frequency scaling (DVFS) to power- and thermal-constrained multicore processors. To optimize the throughput of multicore processors running applications with limited parallelism, we exploit the power and thermal headroom created by power-gating idle cores, which allows the active cores to raise their operating frequency through supply voltage scaling. Our analysis using a 32 nm predictive technology model shows that optimizing the number of active cores and the operating frequency within power, thermal, and supply voltage scaling limits improves the throughput of a 16-core processor by ~16%. Furthermore, we extend our throughput analysis and optimization to account for within-die process variations, which cause core-to-core frequency (and leakage power) variations in a multicore processor. Our analysis shows that exploiting core-to-core frequency variations improves the throughput of a 16-core processor by ~75%.
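The trade-off described above, fewer active cores running faster within a fixed power budget, can be illustrated with a toy search. The sketch below is not the paper's model: it assumes an Amdahl-style speedup, a supply voltage that scales linearly with frequency, dynamic power proportional to V²f, leakage proportional to V, and zero power for gated cores. It ignores thermal limits and process variation entirely. All function names and constants (`leak_nominal`, `max_freq_pct`, and so on) are hypothetical and chosen only for illustration.

```python
def speedup(n_active, freq_ratio, parallel_fraction):
    """Amdahl-style speedup versus one core at nominal frequency (assumed model)."""
    serial_fraction = 1.0 - parallel_fraction
    return freq_ratio / (serial_fraction + parallel_fraction / n_active)


def chip_power(n_active, freq_ratio, dyn_nominal=1.0, leak_nominal=0.3):
    """Toy per-chip power in arbitrary units.

    Assumptions: voltage scales linearly with frequency, dynamic power
    goes as V^2 * f, leakage goes as V, and power-gated cores draw nothing.
    """
    voltage_ratio = freq_ratio
    per_core = (dyn_nominal * voltage_ratio ** 2 * freq_ratio
                + leak_nominal * voltage_ratio)
    return n_active * per_core


def best_config(parallel_fraction, n_cores=16, min_freq_pct=50, max_freq_pct=130):
    """Exhaustively search (active cores, frequency) under the power budget.

    The budget is the power of all cores running at nominal frequency.
    max_freq_pct stands in for the supply-voltage scaling limit.
    Returns (speedup, n_active, freq_ratio) for the best feasible point.
    """
    budget = chip_power(n_cores, 1.0)
    best = (0.0, 0, 0.0)
    for n_active in range(1, n_cores + 1):
        for pct in range(min_freq_pct, max_freq_pct + 1):
            freq_ratio = pct / 100
            if chip_power(n_active, freq_ratio) > budget + 1e-9:
                continue  # violates the power budget
            s = speedup(n_active, freq_ratio, parallel_fraction)
            if s > best[0]:
                best = (s, n_active, freq_ratio)
    return best


if __name__ == "__main__":
    for p in (0.5, 0.9, 1.0):
        s, n, f = best_config(p)
        print(f"parallel fraction {p}: {n} active cores at {f:.2f}x nominal, "
              f"speedup {s:.2f}")
```

Under these assumed constants, a fully parallel workload keeps all 16 cores at nominal frequency, while a workload with limited parallelism is served better by gating some cores and spending the freed budget on frequency. The numbers this toy model produces are not comparable to the ~16% and ~75% figures reported in the abstract.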