Scheduling heterogeneous multi-cores through Performance Impact Estimation (PIE)

Authors:
Kenzo Van Craeynest;Aamer Jaleel;Lieven Eeckhout;Paolo Narvaez;Joel Emer
Affiliations:
Ghent University, Ghent, Belgium;Intel Corporation, VSSAD, Hudson, MA;Ghent University, Ghent, Belgium;Intel Corporation, VSSAD, Hudson, MA;Intel Corporation, VSSAD, Hudson, MA and MIT, Cambridge, MA
Venue:
Proceedings of the 39th Annual International Symposium on Computer Architecture
Year:
2012

Abstract

Single-ISA heterogeneous multi-core processors are typically composed of small (e.g., in-order) power-efficient cores and big (e.g., out-of-order) high-performance cores. The effectiveness of heterogeneous multi-cores depends on how well a scheduler can map workloads onto the most appropriate core type. In general, small cores can achieve good performance if the workload inherently has high levels of ILP. On the other hand, big cores provide good performance if the workload exhibits high levels of MLP or requires the ILP to be extracted dynamically. This paper proposes Performance Impact Estimation (PIE) as a mechanism to predict which workload-to-core mapping is likely to provide the best performance. PIE collects CPI stack, MLP and ILP profile information, and estimates performance if the workload were to run on a different core type. Dynamic PIE adjusts the scheduling at runtime and thereby exploits fine-grained time-varying execution behavior. We show that PIE requires limited hardware support and can improve system performance by an average of 5.5% over recent state-of-the-art scheduling proposals and by 8.7% over a sampling-based scheduling policy.
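
The estimation step described in the abstract can be illustrated with a minimal sketch. It is not the paper's actual model: it assumes a CPI stack reduced to two components (base and memory), and it assumes hypothetical scaling factors `ilp_ratio` and `mlp_ratio` that stand in for whatever the ILP and MLP profile information would yield. The names `Profile`, `estimate_cpi_on_other_core`, and `best_mapping`, and the choice of summed IPC as the objective, are likewise assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass
class Profile:
    """Two-component CPI stack measured on the core a workload runs on now."""
    cpi_base: float  # cycles per instruction excluding memory stalls
    cpi_mem: float   # cycles per instruction spent waiting on memory


def estimate_cpi_on_other_core(p, ilp_ratio, mlp_ratio):
    """Estimate the CPI the workload would see on the other core type.

    ilp_ratio and mlp_ratio are hypothetical multipliers (assumed inputs):
    a value above 1 means that component is expected to be slower on the
    other core, a value below 1 means it is expected to be faster.
    """
    return p.cpi_base * ilp_ratio + p.cpi_mem * mlp_ratio


def best_mapping(cpi):
    """Pick the workload-to-core assignment with the highest summed IPC.

    cpi[w][c] is the measured or estimated CPI of workload w on core c.
    Returns a tuple m where m[w] is the core assigned to workload w.
    Exhaustive search, so only suitable for a handful of cores.
    """
    n = len(cpi)
    return max(
        permutations(range(n)),
        key=lambda m: sum(1.0 / cpi[w][m[w]] for w in range(n)),
    )


# Core 0 is the big core, core 1 is the small core.
a_on_big = Profile(cpi_base=0.5, cpi_mem=0.5)    # workload A, measured on big
b_on_small = Profile(cpi_base=1.0, cpi_mem=0.2)  # workload B, measured on small

a_big = a_on_big.cpi_base + a_on_big.cpi_mem
a_small = estimate_cpi_on_other_core(a_on_big, ilp_ratio=2.0, mlp_ratio=3.0)
b_small = b_on_small.cpi_base + b_on_small.cpi_mem
b_big = estimate_cpi_on_other_core(b_on_small, ilp_ratio=0.8, mlp_ratio=0.5)

mapping = best_mapping([[a_big, a_small], [b_big, b_small]])
print(mapping)
```

With these made-up numbers, workload A loses much of its memory-level parallelism on the small core, so the sketch keeps A on the big core and B on the small core. A dynamic scheduler in the spirit of the abstract would repeat this comparison periodically as the profile counters change.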