Computational sprinting on a hardware/software testbed

Authors:
Arun Raghavan;Laurel Emurian;Lei Shao;Marios Papaefthymiou;Kevin P. Pipe;Thomas F. Wenisch;Milo M.K. Martin
Affiliations:
University of Pennsylvania, Philadelphia, PA, USA;University of Pennsylvania, Philadelphia, PA, USA;University of Michigan, Ann Arbor, MI, USA;University of Michigan, Ann Arbor, MI, USA;University of Michigan, Ann Arbor, MI, USA;University of Michigan, Ann Arbor, MI, USA;University of Pennsylvania, Philadelphia, PA, USA
Venue:
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Year:
2013


Abstract

CMOS scaling trends have led to an inflection point where thermal constraints (especially in mobile devices that employ only passive cooling) preclude sustained operation of all transistors on a chip, a phenomenon called "dark silicon." Recent research proposed computational sprinting, which exceeds sustainable thermal limits for short intervals, to improve responsiveness in light of the bursty computation demands of many media-rich interactive mobile applications. Computational sprinting improves responsiveness by activating reserve cores (parallel sprinting) and/or boosting frequency/voltage (frequency sprinting) to power levels that far exceed the system's sustainable cooling capabilities, relying on thermal capacitance to buffer heat. Prior work analyzed the feasibility of sprinting through modeling and simulation. In this work, we investigate sprinting using a hardware/software testbed. First, we study unabridged sprints, wherein the computation completes before temperature becomes critical, demonstrating a 6.3x responsiveness gain and a 6% energy efficiency improvement by racing to idle. We then analyze truncated sprints, wherein our software runtime system must intervene to prevent overheating by throttling parallelism and frequency before the computation is complete. To avoid oversubscription penalties (context-switching inefficiencies after a truncated parallel sprint), we develop a sprint-aware task-based parallel runtime. We find that maximal-intensity sprinting is not always best, introduce the concept of sprint pacing, and evaluate an adaptive policy for selecting sprint intensity. We report initial results using a phase-change heat sink to extend maximum sprint duration. Finally, we demonstrate that a sprint-and-rest operating regime can actually outperform thermally limited sustained execution.
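
Illustrative sketch (not from the paper): the abstract's idea that thermal capacitance buffers heat during a sprint can be pictured with a single-node lumped RC thermal model, in which C dT/dt = P - (T - T_ambient)/R. The Python below estimates how long a sprint at constant power could last before reaching a temperature limit. The model itself, the function names, and every parameter value are assumptions chosen for illustration; they are not the testbed's measurements or the authors' method.

```python
import math


def temperature_at(t_s, power_w, r_th, c_th, t_ambient, t_start):
    """Temperature (deg C) of a lumped RC thermal node after t_s seconds
    at constant power. r_th is in K/W, c_th in J/K (hypothetical values)."""
    t_steady = t_ambient + power_w * r_th      # steady-state temperature
    tau = r_th * c_th                          # thermal time constant (s)
    return t_steady + (t_start - t_steady) * math.exp(-t_s / tau)


def sprint_duration(power_w, r_th, c_th, t_ambient, t_start, t_max):
    """Seconds until the node reaches t_max at constant power.
    Returns math.inf when the power level is sustainable indefinitely."""
    t_steady = t_ambient + power_w * r_th
    if t_steady <= t_max:
        return math.inf                        # never reaches the limit
    if t_start >= t_max:
        return 0.0                             # already at the limit
    tau = r_th * c_th
    return tau * math.log((t_steady - t_start) / (t_steady - t_max))


if __name__ == "__main__":
    # Assumed, illustrative parameters for a passively cooled device.
    params = dict(r_th=30.0, c_th=2.0, t_ambient=25.0, t_start=25.0, t_max=70.0)
    for power in (1.0, 5.0, 10.0):
        d = sprint_duration(power, **params)
        print(f"{power:4.1f} W -> sprint budget {d:.2f} s")
```

Under these assumed values the sustainable power is (70 - 25) / 30 = 1.5 W, so the 1 W case never hits the limit, while higher power levels shorten the available sprint. A larger c_th lengthens the sprint in this model, which is the role the abstract attributes to thermal capacitance.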