System-level integrated server architectures for scale-out datacenters

  • Authors:
  • Sheng Li;Kevin Lim;Paolo Faraboschi;Jichuan Chang;Parthasarathy Ranganathan;Norman P. Jouppi

  • Affiliations:
  • Hewlett-Packard Labs;Hewlett-Packard Labs;Hewlett-Packard Labs;Hewlett-Packard Labs;Hewlett-Packard Labs;Hewlett-Packard Labs

  • Venue:
  • Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
  • Year:
  • 2011

Abstract

A System-on-Chip (SoC) integrates multiple discrete components into a single chip, for example by placing CPU cores, network interfaces, and I/O controllers on the same die. While SoCs have dominated high-end embedded products for over a decade, system-level integration is a relatively new trend in servers, driven by the opportunity to lower cost (by reducing the number of discrete parts) and power (by reducing the pin crossings from the cores to the I/O). Today, the mounting cost pressures in scale-out datacenters demand technologies that can decrease the Total Cost of Ownership (TCO). At the same time, the diminishing returns of dedicating the increasing number of available transistors to more cores and caches are creating a stronger case for SoC-based servers. This paper examines system-level integration design options for the scale-out server market, specifically targeting datacenter-scale throughput computing workloads. We develop tools to model the area and power of a variety of discrete and integrated server configurations. We evaluate the benefits, trade-offs, and trends of system-level integration for warehouse-scale datacenter servers, and identify the key "uncore" components that reduce cost and power. We perform a comprehensive design space exploration at both the SoC and datacenter levels, identify the sweet spots, and highlight important scaling trends of performance, power, area, and cost from 45nm to 16nm. Our results show that system integration yields substantial benefits, enables novel aggregated configurations with a much higher compute density, and significantly reduces total chip area and dynamic power versus a discrete-component server. Finally, we use utilization traces and architectural profiles of real machines to evaluate the dynamic power consumption of typical scale-out cloud applications, and combine them in an overall TCO model. Our results show that, at 16nm for example, SoC-based servers can achieve more than a 26% TCO reduction at datacenter scale.