Utility-based acceleration of multithreaded applications on asymmetric CMPs

Authors:
José A. Joao; M. Aater Suleman; Onur Mutlu; Yale N. Patt
Affiliations:
The University of Texas at Austin, Austin, TX; Flux7 Consulting, Austin, TX and The University of Texas at Austin, Austin, TX; Carnegie Mellon University, Pittsburgh, PA; The University of Texas at Austin, Austin, TX
Venue:
Proceedings of the 40th Annual International Symposium on Computer Architecture
Year:
2013

Abstract

Asymmetric Chip Multiprocessors (ACMPs) are becoming a reality. ACMPs can speed up parallel applications if they can identify and accelerate code segments that are critical for performance. Proposals already exist for using coarse-grained thread scheduling and fine-grained bottleneck acceleration. However, no proposal so far decides which code segments to accelerate when both coarse-grained thread scheduling and fine-grained bottleneck acceleration could have value. This paper proposes Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs (UBA), a cooperative software/hardware mechanism for identifying and accelerating the most likely critical code segments from a set of multithreaded applications running on an ACMP. The key idea is a new Utility of Acceleration metric that quantifies the performance benefit of accelerating a bottleneck or a thread by taking into account both the criticality and the expected speedup. UBA outperforms the best of two state-of-the-art mechanisms by 11% for single-application workloads and by 7% for two-application workloads on an ACMP with 52 small cores and 3 large cores.
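
To make the key idea concrete, the following is a minimal sketch of how a utility score might combine criticality and expected speedup to rank code segments for the large cores. The abstract does not give the metric's formula, so the product form used here, along with the names Segment, utility, and pick_for_large_cores, are illustrative assumptions and not the paper's definition or implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """A candidate for acceleration: a bottleneck or a whole thread (hypothetical model)."""
    name: str
    time_fraction: float  # assumed share of total execution time spent in the segment (0..1)
    speedup: float        # assumed expected speedup when run on a large core (>= 1)
    criticality: float    # assumed share of the saved time that shortens the whole run (0..1)


def utility(seg: Segment) -> float:
    """Estimated fractional reduction in total execution time if seg is accelerated.

    Assumption: utility = time_fraction * (1 - 1/speedup) * criticality.
    """
    local_reduction = 1.0 - 1.0 / seg.speedup  # fraction of the segment's own time saved
    return seg.time_fraction * local_reduction * seg.criticality


def pick_for_large_cores(segments: List[Segment], num_large_cores: int) -> List[Segment]:
    """Select the highest-utility segments, one per available large core."""
    ranked = sorted(segments, key=utility, reverse=True)
    return ranked[:num_large_cores]


if __name__ == "__main__":
    candidates = [
        Segment("critical_section_A", time_fraction=0.10, speedup=2.0, criticality=1.0),
        Segment("lagging_thread", time_fraction=0.40, speedup=1.5, criticality=0.5),
        Segment("non_critical_worker", time_fraction=0.40, speedup=2.0, criticality=0.0),
    ]
    for seg in pick_for_large_cores(candidates, num_large_cores=2):
        print(seg.name, round(utility(seg), 4))
```

Under these assumed numbers, a long-running thread with moderate criticality can outrank a short but fully critical bottleneck, which is the kind of trade-off a single metric covering both threads and bottlenecks is meant to resolve.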