Optimizing Google's warehouse scale computers: The NUMA experience

  • Authors:
  • Lingjia Tang;Jason Mars;Xiao Zhang;Robert Hagmann;Robert Hundt;Eric Tune

  • Affiliations:
  • University of California, San Diego, USA;University of California, San Diego, USA;Google, USA;Google, USA;Google, USA;Google, USA

  • Venue:
  • HPCA '13 Proceedings of the 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA)
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Due to the complexity and the massive scale of modern warehouse scale computers (WSCs), it is challenging to quantify the performance impact of individual microarchitectural properties and the potential optimization benefits in the production environment. As a result of these challenges, there is currently a lack of understanding of the microarchitecture-workload interaction, leaving potentially significant performance on the table. This paper argues for a two-phase performance analysis methodology for optimizing WSCs that combines both an in-production investigation and an experimental load-testing study. To demonstrate the effectiveness of this two-phase approach, and to illustrate the challenges, methodologies and opportunities in optimizing modern WSCs, this paper investigates the impact of non-uniform memory access (NUMA) for several Google's key web-service workloads in large-scale production WSCs. Leveraging a newly-designed metric and continuous large-scale profiling in live datacenters, our production analysis demonstrates that NUMA has a significant impact (10–20%) on two important web-services: Gmail backend and web-search frontend. Our carefully designed load-test further reveals surprising tradeoffs between optimizing for NUMA performance and reducing cache contention.