CoMon is an evolving, mostly-scalable monitoring system for PlanetLab that has the goal of presenting environment-tailored information for both the administrators and users of the PlanetLab global testbed. In addition to passively reporting metrics provided by the operating system, CoMon also actively gathers a number of metrics useful for developers of networked systems. Using CoMon, PlanetLab administrators and users can easily spot problematic machines, where the problem may arise from the machine itself, local configuration/environment problems, or the workload running on the machine. Furthermore, users can easily observe many properties of all of the experiments running across multiple PlanetLab nodes, facilitating not only their own experiment monitoring and debugging, but also helping scale the task of finding PlanetLab problems.

In this paper we describe CoMon's design and operation, including what kinds of data are gathered, the scale of the processing involved, and the approaches we have taken to keep CoMon running. Our goal is not only to illustrate the kinds of problems faced in this environment, but also to invite others to participate, either by experimenting with the data generated by CoMon, or by building on the CoMon system itself.