Improving performance per watt of asymmetric multi-core processors via online program phase classification and adaptive core morphing

  • Authors:
  • Rance Rodrigues;Arunachalam Annamalai;Israel Koren;Sandip Kundu

  • Affiliations:
  • University of Massachusetts at Amherst, Amherst, MA;University of Massachusetts at Amherst, Amherst, MA;University of Massachusetts at Amherst, Amherst, MA;University of Massachusetts at Amherst, Amherst, MA

  • Venue:
  • ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special section on adaptive power management for energy and temperature-aware computing systems
  • Year:
  • 2013

Abstract

Asymmetric multi-core processors (AMPs) have been shown to outperform symmetric ones in terms of performance and performance/watt. Improved performance and power efficiency are achieved when the program threads are matched to their most suitable cores. Since the computational needs of a program may change during its execution, the best thread-to-core assignment will likely change with time. We have, therefore, developed an online program phase classification scheme that allows the swapping of threads when the current needs of the threads justify a change in the assignment. The architectural differences among the cores in an AMP can never match the diversity that exists among different programs and even between different phases of the same program. Consider, for example, a program (or a program phase) that has high instruction-level parallelism (ILP) and will exhibit high power efficiency if executed on a powerful core. We cannot, however, include such powerful cores in the designed AMP, since they will remain underutilized most of the time, and they are not power efficient when the programs do not exhibit a high degree of ILP. Thus, we must expect to see program phases in which the designed cores will be unable to support the ILP that the program can exhibit. We, therefore, propose in this article a dynamic morphing scheme. This scheme allows a core, during periods of intense computation with high ILP, to gain control of a functional unit that is ordinarily under the control of a neighboring core. In this way, we dynamically adjust the hardware resources to the current needs of the application. Our results show that combining online phase classification and dynamic core morphing can significantly improve the performance/watt of most multithreaded workloads.
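
The decision flow described in the abstract (classify each thread's current phase, swap threads when the assignment no longer fits, and morph a core when a phase shows high ILP) might be sketched as follows. This is an illustration only, not the authors' scheme: the abstract does not say which metrics drive classification, so the two-core setup (one integer-oriented core, one floating-point-oriented core), the `PhaseSample` fields, the function names, and both thresholds are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PhaseSample:
    """Counters for one thread over one sampling interval (hypothetical fields)."""
    int_ops: int   # integer instructions retired in the interval
    fp_ops: int    # floating-point instructions retired in the interval
    ipc: float     # instructions per cycle, used here as a rough ILP proxy


def classify(sample: PhaseSample, fp_threshold: float = 0.3) -> str:
    """Label a phase "fp" or "int" by its instruction mix (assumed criterion)."""
    total = sample.int_ops + sample.fp_ops
    if total == 0:
        return "int"  # no activity observed; default label
    return "fp" if sample.fp_ops / total >= fp_threshold else "int"


def decide(on_int_core: PhaseSample, on_fp_core: PhaseSample,
           high_ilp_ipc: float = 2.0) -> str:
    """Return "swap", "morph", or "keep" for a pair of neighboring cores.

    on_int_core: sample of the thread currently on the integer-oriented core.
    on_fp_core:  sample of the thread currently on the FP-oriented core.
    The thresholds are illustrative values, not taken from the article.
    """
    # Both threads sit on the core that suits the other one: swap them.
    if classify(on_int_core) == "fp" and classify(on_fp_core) == "int":
        return "swap"
    # A high-ILP phase: let that core borrow a neighbor's functional unit.
    if max(on_int_core.ipc, on_fp_core.ipc) >= high_ilp_ipc:
        return "morph"
    return "keep"
```

For instance, `decide(PhaseSample(10, 90, 1.0), PhaseSample(95, 5, 1.0))` selects a swap under these assumed thresholds, because each thread's instruction mix fits the other core. A real implementation would also have to weigh the cost of swapping or morphing against the expected gain in performance/watt, which this sketch ignores.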