The Road Ahead: The significance of packaging
IEEE Design & Test
Active Storage for Large-Scale Data Mining and Multimedia
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
The M5 Simulator: Modeling Networked Systems
IEEE Micro
PicoServer: using 3D stacking technology to enable a compact energy efficient chip multiprocessor
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
An 8-core, 64-thread, 64-bit power efficient sparc soc (niagara2)
Proceedings of the 2007 international symposium on Physical design
Understanding and Designing New Server Architectures for Emerging Warehouse-Computing Environments
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Gordon: using flash memory to build fast, power-efficient clusters for data-intensive applications
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
Achieving 10 Gb/s using safe and transparent network interface virtualization
Proceedings of the 2009 ACM SIGPLAN/SIGOPS international conference on Virtual execution environments
Microarchitecture in the system-level integration era
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
FAWN: a fast array of wimpy nodes
Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Digital Integrated Circuit Design: From VLSI Architectures to CMOS Fabrication
Digital Integrated Circuit Design: From VLSI Architectures to CMOS Fabrication
Proceedings of the International Conference on Computer-Aided Design
Supercomputing with commodity CPUs: are mobile SoCs ready for HPC?
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Enabling datacenter servers to scale out economically and sustainably
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Market mechanisms for managing datacenters with heterogeneous microarchitectures
ACM Transactions on Computer Systems (TOCS)
Rhythm: harnessing data parallel hardware for server workloads
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
A System-on-Chip (SoC) integrates multiple discrete components into a single chip, for example by placing CPU cores, network interfaces and I/O controllers on the same die. While SoCs have dominated high-end embedded products for over a decade, system-level integration is a relatively new trend in servers, and is driven by the opportunity to lower cost (by reducing the number of discrete parts) and power (by reducing the pin crossings from the cores to the I/O). Today, the mounting cost pressures in scale-out dat-acenters demand technologies that can decrease the Total Cost of Ownership (TCO). At the same time, the diminshing return of dedicating the increasing number of available transistors to more cores and caches is creating a stronger case for SoC-based servers. This paper examines system-level integration design options for the scale-out server market, specifically targeting datacenter-scale throughput computing workloads. We develop tools to model the area and power of a variety of discrete and integrated server configurations. We evaluate the benefits, trade-offs, and trends of system-level integration for warehouse-scale datacenter servers, and identify the key "uncore" components that reduce cost and power. We perform a comprehensive design space exploration at both SoC and datacenter level, identify the sweet spots, and highlight important scaling trends of performance, power, area, and cost from 45nm to 16nm. Our results show that system integration yields substantial benefits, enables novel aggregated configurations with a much higher compute density, and significantly reduces total chip area and dynamic power versus a discrete-component server. Finally, we use utilization traces and architectural profiles of real machines to evaluate the dynamic power consumption of typical scale-out cloud applications, and combine them in an overall TCO model. Our results show that, for example at 16nm, SoC-based servers can achieve more than a 26% TCO reduction at datacenter scale.