PeerMon: a peer-to-peer network monitoring system

  • Authors:
  • Tia Newhall;Janis Libeks;Ross Greenwood;Jeff Knerr

  • Affiliations:
  • Swarthmore College Computer Science Department, Swarthmore, PA;Swarthmore College Computer Science Department, Swarthmore, PA;Swarthmore College Computer Science Department, Swarthmore, PA;Swarthmore College Computer Science Department, Swarthmore, PA

  • Venue:
  • LISA '10: Proceedings of the 24th International Conference on Large Installation System Administration
  • Year:
  • 2010

Abstract

We present PeerMon, a peer-to-peer resource monitoring system for general-purpose Unix local area network (LAN) systems. PeerMon is designed to monitor system resources on a single LAN, but it could also be deployed on several LANs where some inter-LAN resource sharing is supported. Its peer-to-peer design makes PeerMon a scalable and fault-tolerant monitoring system for efficiently collecting system-wide resource usage information. Experiments evaluating PeerMon's performance show that it adds little overhead to the system and that it scales well to large LANs. PeerMon was initially designed to be used by system services that provide load balancing and job placement; however, it can easily be extended to provide monitoring data for other system-wide services. We present three tools (smarterSSH, autoMPIgen, and a dynamic DNS binding system) that use PeerMon data to pick "good" nodes for job or process placement in a LAN. Tools using PeerMon data for job placement can greatly improve the performance of applications running on general-purpose LANs. We present results showing application speed-ups of up to a factor of 4.6 using our tools.
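
To illustrate what picking "good" nodes from monitoring data might look like, here is a minimal sketch in Python. It is not PeerMon's actual interface or ranking method, which the abstract does not describe. The `NodeStats` record, its fields, the `rank_nodes` function, and the scoring formula are all hypothetical, chosen only to show how a placement tool could order hosts by CPU load and free memory.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeStats:
    """Hypothetical per-node snapshot; not PeerMon's real data format."""
    host: str
    cpu_load: float     # assumed: a load average, lower is better
    free_mem_mb: float  # assumed: available RAM in MB, higher is better


def rank_nodes(stats: List[NodeStats],
               cpu_weight: float = 1.0,
               mem_weight: float = 1.0) -> List[str]:
    """Return host names ordered from best to worst placement candidate.

    The score is an illustrative assumption: weighted CPU load minus
    weighted free memory (normalized to the largest value seen).
    A lower score means a better candidate.
    """
    if not stats:
        return []
    # Guard against division by zero if every node reports 0 MB free.
    max_mem = max(s.free_mem_mb for s in stats) or 1.0

    def score(s: NodeStats) -> float:
        return cpu_weight * s.cpu_load - mem_weight * (s.free_mem_mb / max_mem)

    return [s.host for s in sorted(stats, key=score)]


if __name__ == "__main__":
    snapshot = [
        NodeStats("a", cpu_load=3.0, free_mem_mb=1000),
        NodeStats("b", cpu_load=0.2, free_mem_mb=4000),
        NodeStats("c", cpu_load=0.5, free_mem_mb=2000),
    ]
    print(rank_nodes(snapshot))
```

A tool in the spirit of smarterSSH could then connect to the first host in the returned list. The weights are left as parameters because the right trade-off between CPU and memory depends on the workload.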