Reliable on-demand management operations for large-scale distributed applications

  • Authors: Jin Liang, Indranil Gupta, Klara Nahrstedt
  • Affiliation: University of Illinois at Urbana-Champaign (all authors)
  • Venue: ACM SIGOPS Operating Systems Review - Gossip-based computer networking
  • Year: 2007

Abstract

This paper argues for attention to instant monitoring and management tasks for large-scale distributed applications running across hundreds of hosts, and proposes a novel direction for solving them. We present the MON (Management Overlay Networks) approach, which uses a novel concept called on-demand overlays to support instant commands such as queries and software pushes. On-demand overlays are built on the fly and probabilistically, leveraging the weakly consistent gossip-style membership information underneath. Thus, they are lightweight in terms of memory, computation, and bandwidth. We augment on-demand overlays with several notions of application-specified reliability, and show how MON detects and adheres to these. MON is available atop PlanetLab, and we present experimental results. We conclude with a series of promising open problems in this direction.
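
To make the idea of an overlay built on the fly from weakly consistent membership more concrete, the following is a minimal illustrative sketch in Python, not the MON implementation. It assumes that each host holds a small random partial view of the other hosts (a stand-in for gossip-maintained membership) and that an overlay is a tree grown from the initiating host, with each joining host inviting a few random members of its view. The function names, the `view_size` and `fanout` parameters, and the tree-growing rule are all hypothetical choices made for this example.

```python
import random


def build_partial_views(nodes, view_size, rng):
    """Assign each node a random partial view of the other nodes.

    Assumption: this stands in for the membership lists that a
    gossip-style protocol would maintain in the background.
    """
    views = {}
    for node in nodes:
        others = [m for m in nodes if m != node]
        views[node] = rng.sample(others, min(view_size, len(others)))
    return views


def build_on_demand_tree(root, views, fanout, rng):
    """Grow a tree overlay from `root` using only local views.

    Each newly joined node invites up to `fanout` random members of
    its own view; a node that has already joined keeps its first parent.
    Returns a dict mapping each reached node to its parent (root -> None).
    """
    parent = {root: None}
    frontier = [root]
    while frontier:
        next_frontier = []
        for node in frontier:
            candidates = views[node]
            invited = rng.sample(candidates, min(fanout, len(candidates)))
            for child in invited:
                if child not in parent:  # already-joined nodes decline
                    parent[child] = node
                    next_frontier.append(child)
        frontier = next_frontier
    return parent


def coverage(parent, nodes):
    """Fraction of all nodes reached by the overlay."""
    return len(parent) / len(nodes)


if __name__ == "__main__":
    rng = random.Random(42)
    hosts = list(range(200))
    views = build_partial_views(hosts, view_size=10, rng=rng)
    tree = build_on_demand_tree(root=0, views=views, fanout=5, rng=rng)
    print(f"reached {coverage(tree, hosts):.0%} of hosts")
```

Because construction is probabilistic, such an overlay may not reach every host. In this sketch the `coverage` value is one simple quantity an application could compare against a reliability target of its own choosing.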