CMOS scaling trends have led to an inflection point where thermal constraints (especially in mobile devices that employ only passive cooling) preclude sustained operation of all transistors on a chip, a phenomenon called "dark silicon." Recent research proposed computational sprinting, in which a chip exceeds its sustainable thermal limits for short intervals, to improve responsiveness in light of the bursty computation demands of many media-rich interactive mobile applications. Computational sprinting improves responsiveness by activating reserve cores (parallel sprinting) and/or boosting frequency and voltage (frequency sprinting) to power levels that far exceed the system's sustainable cooling capabilities, relying on thermal capacitance to buffer the heat. Prior work analyzed the feasibility of sprinting through modeling and simulation. In this work, we investigate sprinting using a hardware/software testbed. First, we study unabridged sprints, wherein the computation completes before the temperature becomes critical, demonstrating a 6.3x responsiveness gain and a 6% energy efficiency improvement from racing to idle. We then analyze truncated sprints, wherein our software runtime system must intervene to prevent overheating by throttling parallelism and frequency before the computation is complete. To avoid oversubscription penalties (context-switching inefficiencies after a truncated parallel sprint), we develop a sprint-aware task-based parallel runtime. We find that maximal-intensity sprinting is not always best, introduce the concept of sprint pacing, and evaluate an adaptive policy for selecting sprint intensity. We report initial results using a phase-change heat sink to extend maximum sprint duration. Finally, we demonstrate that a sprint-and-rest operating regime can outperform thermally limited sustained execution.
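The role of thermal capacitance in bounding sprint duration can be illustrated with a first-order lumped thermal model. The sketch below is not the testbed or runtime described in the abstract; it assumes a single thermal resistance and capacitance, constant sprint power, and made-up parameter values, and the function name `sprint_duration` is hypothetical. It only shows why a higher sprint intensity reaches the temperature limit sooner.

```python
import math


def sprint_duration(power_w, r_th, c_th, t_amb, t_start, t_max):
    """Seconds until a lumped RC thermal node reaches t_max at constant power.

    Assumed model (not from the source): c_th * dT/dt = power_w - (T - t_amb) / r_th.
    Returns math.inf if the power level is thermally sustainable.
    """
    # Temperature the node would settle at if this power were held forever.
    t_steady = t_amb + power_w * r_th
    if t_steady <= t_max:
        return math.inf  # never reaches the limit: sustainable operation
    if t_start >= t_max:
        return 0.0  # already at or above the limit: no sprint budget left
    tau = r_th * c_th  # thermal time constant in seconds
    # Solve T(t) = t_steady + (t_start - t_steady) * exp(-t / tau) for T(t) = t_max.
    return -tau * math.log((t_max - t_steady) / (t_start - t_steady))


if __name__ == "__main__":
    # Hypothetical values: 20 K/W to ambient, 1 J/K capacitance, 25 C ambient, 70 C limit.
    params = dict(r_th=20.0, c_th=1.0, t_amb=25.0, t_start=25.0, t_max=70.0)
    for watts in (2.0, 8.0, 16.0):
        print(f"{watts:5.1f} W -> {sprint_duration(watts, **params):.2f} s")
```

Under these assumed values, power levels at or below 2.25 W can be sustained indefinitely, while higher levels exhaust the thermal budget in seconds, with the time shrinking as power grows. Whether a lower sprint intensity completes more work before throttling depends on how performance scales with power, which this sketch does not model.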