Low latency energy efficient communications in global-scale cloud computing systems

  • Authors: Ted H. Szymanski
  • Affiliations: McMaster University, Hamilton, ON, Canada
  • Venue: Proceedings of the 2013 workshop on Energy efficient high performance parallel and distributed computing
  • Year: 2013

Abstract

This paper explores technologies to achieve low-latency, energy-efficient communications in Global-Scale Cloud Computing systems. A global-scale cloud computing system linking 100 remote data-centers can potentially interconnect 5M servers, considerably more than traditional High-Performance-Computing (HPC) machines. Traditional HPC machines use tightly coupled processors and networks which rarely drop packets. In contrast, today's IP Internet is a relatively loosely-coupled Best-Effort network with poor latency and energy-efficiency guarantees and relatively high packet loss rates. This paper explores the use of a recently-proposed Future-Internet network, which combines a QoS-aware router scheduling algorithm with a new IETF resource reservation signalling technology, to achieve improved latency and energy-efficiency in cloud computing systems. A Maximum-Flow Minimum-Energy routing algorithm is used to route high-capacity "trunks" between data-centers distributed over the continental USA, using a USA IP network topology. The communications between virtual machines in remote data-centers are aggregated and multiplexed onto the trunks to achieve significantly improved energy-efficiency. According to theory and simulations, the large and variable queueing delays of traditional Best-Effort Internet links can be eliminated, and the latency over the cloud can be reduced to near-minimal values, i.e., the fiber latency. The maximum fiber latencies over the Sprint USA network are approximately 20 milliseconds, comparable to hard disk drive latencies, and multithreading in virtual machines can be used to hide these latencies. Furthermore, if existing dark fiber over the continental network is activated, the bisection bandwidth available in a global-scale cloud computing system can rival that achievable in commercial HPC machines.
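The fiber-latency figure quoted in the abstract can be sanity-checked with simple propagation-delay arithmetic. The sketch below is illustrative only and is not taken from the paper: the function name `fiber_latency_ms`, the refractive index of 1.468, and the example path lengths are all assumptions chosen for the illustration.

```python
# Back-of-envelope one-way propagation delay over optical fiber.
# All constants and path lengths below are assumptions, not values from the paper.

C_VACUUM_KM_S = 299_792.458      # speed of light in vacuum, km/s
FIBER_REFRACTIVE_INDEX = 1.468   # assumed typical value for silica fiber


def fiber_latency_ms(path_km: float, n: float = FIBER_REFRACTIVE_INDEX) -> float:
    """Return the one-way propagation delay in milliseconds over path_km of fiber."""
    speed_km_s = C_VACUUM_KM_S / n   # light travels more slowly in glass than in vacuum
    return path_km / speed_km_s * 1000.0


if __name__ == "__main__":
    # Hypothetical path lengths, chosen only to show the scale of the delay.
    for km in (1000, 2500, 4000):
        print(f"{km} km -> {fiber_latency_ms(km):.1f} ms")
```

Under these assumptions, a fiber path of about 4000 km gives a one-way delay of roughly 19.6 ms, which is consistent in scale with the approximately 20 ms maximum quoted for the Sprint USA network. Real fiber routes are longer than straight-line distances, so the path length here should be read as route length, not geographic distance.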