DreamWeaver: architectural support for deep sleep

  • Authors:
  • David Meisner;Thomas F. Wenisch

  • Affiliations:
  • University of Michigan, Ann Arbor, MI, USA;University of Michigan, Ann Arbor, MI, USA

  • Venue:
  • ASPLOS XVII Proceedings of the seventeenth international conference on Architectural Support for Programming Languages and Operating Systems
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Numerous data center services exhibit low average utilization leading to poor energy efficiency. Although CPU voltage and frequency scaling historically has been an effective means to scale down power with utilization, transistor scaling trends are limiting its effectiveness and the CPU is accounting for a shrinking fraction of system power. Recent research advocates the use of full-system idle low-power modes to combat energy losses, as such modes provide the deepest power savings with bounded response time impact. However, the trend towards increasing cores per die is undermining the effectiveness of these sleep modes, particularly for request-parallel data center applications, because the independent idle periods across individual cores are unlikely to align by happenstance. We propose DreamWeaver, architectural support to facilitate deep sleep for request-parallel applications on multicore servers. DreamWeaver comprises two elements: Weave Scheduling, a scheduling policy to coalesce idle and busy periods across cores to create opportunities for system-wide deep sleep; and the Dream Processor, a light-weight co-processor that monitors incoming network traffic and suspended work during sleep to determine when the system must wake. DreamWeaver is based on two key concepts: (1) stall execution and sleep anytime any core is unoccupied, but (2) constrain the maximum time any request may be stalled. Unlike prior scheduling approaches, DreamWeaver will preempt execution to sleep, maximizing time spent at the systems' most efficient operating point. We demonstrate that DreamWeaver can smoothly trade-off bounded, predictable increases in 99th-percentile response time for increasing power savings, and strictly dominates the savings available with voltage and frequency scaling and timeout-based request batching schemes.