Predictable Cloud Computing

  • Authors:
  • Sape J. Mullender

  • Affiliations:
  • Network Systems Lab, Bell Labs, Antwerp, Belgium

  • Venue:
  • Bell Labs Technical Journal
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

The standard tools for cloud computing—processor and network virtualization—make it difficult to achieve dependability, both in terms of real time operations and fault tolerance. Virtualization multiplexes virtual resources onto physical ones, typically by time division or statistical multiplexing. Time, in the virtual machine, is therefore as virtual as the machine itself. And fault tolerance is difficult to achieve when redundancy and independent failure in the virtual environment do not necessarily map to those properties in the physical environment. Virtualization adds a level of indirection that creates overhead, and makes it all but impossible to achieve predictable performance. Osprey uses an alternative to virtualization that achieves the same goals of scalability and flexibility but carries neither the overhead of virtualization, nor the restrictions on dependability. The result is a programming environment that achieves most of the compatibility offered by traditional virtualization efforts and provides much better and much more predictable performance. One technique we use is called Library OS, which stems from high-performance computing. The technique consists of linking applications with a library that implements most services normally provided by the operating system, creating an application that can run practically stand alone, or at least with a very minimal operating system. The Library OS approach moves the boundary between application and operating system down to a level where interactions with the operating system consist of sending/receiving messages (e.g., network packets) and scheduling resources (processor, memory, network bandwidth, and device access). These interactions, as we demonstrate, form a relatively weak bond between an application and the particular instance of the operating system on which it runs—one that can be broken and re-established elsewhere. In fact, we make sure this is the case. Legacy applications that cannot be recompiled or relinked can make use of a Library OS server that runs as a tandem process along with the legacy application processes. System calls from the legacy process are catapulted into the Library OS server which executes them. Applications can still migrate, taking their server process along with them. © 2012 Alcatel-Lucent. © 2012 Wiley Periodicals, Inc.