The importance of dynamic thread scheduling is increasing with the emergence of Asymmetric Multicore Processors (AMPs). Since the computing needs of a thread often vary during its execution, a fixed thread-to-core assignment is sub-optimal. Reassigning threads to cores (thread swapping) when a thread enters a new phase with different computational needs can significantly improve the energy efficiency of AMPs. Although identifying phase changes in the threads is not difficult, determining the appropriate thread-to-core assignment is a challenge. The problem is further complicated by the multiple power states that may be available in the cores. To address this, we propose a novel technique that dynamically assesses the needs of each program phase and determines whether swapping threads between core types and/or changing the voltage/frequency levels of the cores (DVFS) will result in higher throughput/Watt. This is achieved by predicting the expected throughput/Watt of the current program phase at different voltage/frequency levels on all the available core types in the AMP. We show that the benefits from thread swapping and DVFS are orthogonal, which demonstrates the potential of the proposed scheme to achieve significant benefits by seamlessly combining the two. We illustrate our approach using a dual-core High-Performance (HP)/Low-Power (LP) AMP with two power states and demonstrate significant throughput/Watt improvement over different baselines.
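The decision step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the predictions are produced or how candidates are searched, so the sketch takes the per-phase predictions as given inputs and simply enumerates every thread-to-core assignment and voltage/frequency setting on a dual-core HP/LP AMP with two power states. The function names, the level labels `high` and `low`, the system-level metric (total throughput divided by total power), and all numbers in `PREDICTIONS` are hypothetical and chosen only for illustration.

```python
from itertools import permutations, product

CORE_TYPES = ("HP", "LP")    # one core of each type (dual-core AMP)
VF_LEVELS = ("high", "low")  # two power states per core (hypothetical labels)

# Hypothetical predictions for the current phase of each thread:
# PREDICTIONS[thread][(core_type, vf_level)] = (throughput, power_in_watts)
PREDICTIONS = {
    "A": {  # assumed compute-bound phase
        ("HP", "high"): (2.0, 10.0), ("HP", "low"): (1.4, 5.0),
        ("LP", "high"): (1.0, 4.0),  ("LP", "low"): (0.7, 2.0),
    },
    "B": {  # assumed memory-bound phase
        ("HP", "high"): (0.6, 8.0),  ("HP", "low"): (0.55, 4.0),
        ("LP", "high"): (0.5, 3.0),  ("LP", "low"): (0.45, 1.5),
    },
}


def throughput_per_watt(config, predictions):
    """Score a configuration as total throughput over total power.

    config maps each thread to a (core_type, vf_level) pair.
    This system-level metric is an assumption of the sketch.
    """
    total_throughput = 0.0
    total_power = 0.0
    for thread, setting in config.items():
        throughput, power = predictions[thread][setting]
        total_throughput += throughput
        total_power += power
    return total_throughput / total_power


def best_configuration(threads, predictions):
    """Exhaustively try every core assignment and V/F setting."""
    best, best_score = None, float("-inf")
    # Each permutation places the threads on distinct cores (swap or no swap).
    for cores in permutations(CORE_TYPES, len(threads)):
        # Each core independently runs at one of the V/F levels.
        for levels in product(VF_LEVELS, repeat=len(threads)):
            config = {t: (c, v) for t, c, v in zip(threads, cores, levels)}
            score = throughput_per_watt(config, predictions)
            if score > best_score:
                best, best_score = config, score
    return best, best_score


best, score = best_configuration(("A", "B"), PREDICTIONS)
print(best, round(score, 4))
```

With two threads, two core types, and two power states there are only eight candidate configurations, so exhaustive enumeration is cheap here. A real scheduler would also need to weigh the cost of a swap or a voltage/frequency transition against the predicted gain, which this sketch leaves out.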