Dynamic adaptive virtual core mapping to improve power, energy, and performance in multi-socket multicores

Authors:
Chang S. Bae; Lei Xia; Peter Dinda; John Lange
Affiliations:
Northwestern University, Evanston, IL, USA; Northwestern University, Evanston, IL, USA; Northwestern University, Evanston, IL, USA; University of Pittsburgh, Pittsburgh, PA, USA
Venue:
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Year:
2012

Abstract

Consider a multithreaded parallel application running inside a multicore virtual machine context that is itself hosted on a multi-socket multicore physical machine. How should the VMM map virtual cores to physical cores? We compare a local mapping, which compacts virtual cores to processor sockets, and an interleaved mapping, which spreads them over the sockets. Simply choosing between these two mappings exposes clear tradeoffs between performance, energy, and power. We then describe the design, implementation, and evaluation of a system that automatically and dynamically chooses between the two mappings. The system consists of a set of efficient online VMM-based mechanisms and policies that (a) capture the relevant characteristics of memory reference behavior, (b) provide a policy and mechanism for configuring the mapping of virtual machine cores to physical cores that optimizes for power, energy, or performance, and (c) drive dynamic migrations of virtual cores among local physical cores based on the workload and the currently specified objective. Using these techniques we demonstrate that the performance of SPEC and PARSEC benchmarks can be increased by as much as 66%, energy reduced by as much as 31%, and power reduced by as much as 17%, depending on the optimization objective.
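
To make the difference between the two mappings concrete, here is a minimal sketch of how virtual cores could be assigned to (socket, core) slots under each scheme. It is an illustration only, not the system described in the abstract: the function names, the (socket, core) numbering, and the assumption that every socket has the same number of cores are all hypothetical.

```python
from typing import List, Tuple

# A placement is (socket index, core index within that socket).
# This numbering scheme is an assumption made for illustration.
Placement = Tuple[int, int]


def _check_capacity(n_vcores: int, n_sockets: int, cores_per_socket: int) -> None:
    """Reject requests that need more physical cores than the machine has."""
    if n_vcores < 0 or n_sockets <= 0 or cores_per_socket <= 0:
        raise ValueError("counts must be positive (n_vcores may be zero)")
    if n_vcores > n_sockets * cores_per_socket:
        raise ValueError("more virtual cores than physical cores")


def local_mapping(n_vcores: int, n_sockets: int, cores_per_socket: int) -> List[Placement]:
    """Compact virtual cores onto as few sockets as possible.

    Socket 0 is filled first, then socket 1, and so on.
    """
    _check_capacity(n_vcores, n_sockets, cores_per_socket)
    return [(v // cores_per_socket, v % cores_per_socket) for v in range(n_vcores)]


def interleaved_mapping(n_vcores: int, n_sockets: int, cores_per_socket: int) -> List[Placement]:
    """Spread virtual cores round-robin across all sockets."""
    _check_capacity(n_vcores, n_sockets, cores_per_socket)
    return [(v % n_sockets, v // n_sockets) for v in range(n_vcores)]


def sockets_used(mapping: List[Placement]) -> int:
    """Count the distinct sockets that host at least one virtual core."""
    return len({socket for socket, _ in mapping})


if __name__ == "__main__":
    # Hypothetical machine: 2 sockets with 4 cores each, hosting a 4-vcore VM.
    local = local_mapping(4, n_sockets=2, cores_per_socket=4)
    spread = interleaved_mapping(4, n_sockets=2, cores_per_socket=4)
    print(local, sockets_used(local))    # [(0, 0), (0, 1), (0, 2), (0, 3)] 1
    print(spread, sockets_used(spread))  # [(0, 0), (1, 0), (0, 1), (1, 1)] 2
```

With four virtual cores on this hypothetical two-socket machine, the local mapping occupies a single socket while the interleaved mapping occupies both. Which one is preferable for power, energy, or performance depends on the workload's memory reference behavior, which is what the system described above measures online before choosing.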