CoMon: a mostly-scalable monitoring system for PlanetLab

  • Authors: KyoungSoo Park; Vivek S. Pai

  • Affiliations: Princeton University; Princeton University

  • Venue: ACM SIGOPS Operating Systems Review

  • Year: 2006

Abstract

CoMon is an evolving, mostly-scalable monitoring system for PlanetLab that has the goal of presenting environment-tailored information for both the administrators and users of the PlanetLab global testbed. In addition to passively reporting metrics provided by the operating system, CoMon also actively gathers a number of metrics useful for developers of networked systems. Using CoMon, PlanetLab administrators and users can easily spot problematic machines, where the problem may arise from the machine itself, local configuration/environment problems, or the workload running on the machine. Furthermore, users can easily observe many properties of all of the experiments running across multiple PlanetLab nodes, facilitating not only their own experiment monitoring and debugging, but also helping scale the task of finding PlanetLab problems.

In this paper we describe CoMon's design and operation, including what kinds of data are gathered, the scale of the processing involved, and the approaches we have taken to keep CoMon running. Our goal is not only to illustrate the kinds of problems faced in this environment, but also to invite others to participate, either by experimenting with the data generated by CoMon, or by building on the CoMon system itself.