Feasibility of a serverless distributed file system deployed on an existing set of desktop PCs
Proceedings of the 2000 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Passive network tomography using Bayesian inference
Proceedings of the 2nd ACM SIGCOMM Workshop on Internet measurment
A technique for counting natted hosts
Proceedings of the 2nd ACM SIGCOMM Workshop on Internet measurment
Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems
Middleware '01 Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg
Detection of Invalid Routing Announcement in the Internet
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
Origin authentication in interdomain routing
Proceedings of the 10th ACM conference on Computer and communications security
SPV: secure path vector routing for securing BGP
Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications
Locating BGP missing routes using multiple perspectives
Proceedings of the ACM SIGCOMM workshop on Network troubleshooting: research, theory and operations practice meet malfunctioning reality
An empirical study of spam traffic and the use of DNS black lists
Proceedings of the 4th ACM SIGCOMM conference on Internet measurement
BOINC: A System for Public-Resource Computing and Storage
GRID '04 Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing
Farsite: federated, available, and reliable storage for an incompletely trusted environment
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
BGP routing changes: merging views from two ISPs
ACM SIGCOMM Computer Communication Review
CoMon: a mostly-scalable monitoring system for PlanetLab
ACM SIGOPS Operating Systems Review
Understanding the network-level behavior of spammers
Proceedings of the 2006 conference on Applications, technologies, architectures, and protocols for computer communications
Optimizing BGP security by exploiting path stability
Proceedings of the 13th ACM conference on Computer and communications security
The Farsite project: a retrospective
ACM SIGOPS Operating Systems Review - Systems work at Microsoft Research
Listen and whisper: security mechanisms for BGP
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
Total recall: system support for automated availability management
NSDI'04 Proceedings of the 1st conference on Symposium on Networked Systems Design and Implementation - Volume 1
Using PlanetLab for network research: myths, realities, and best practices
WORLDS'05 Proceedings of the 2nd conference on Real, Large Distributed Systems - Volume 2
Concilium: Collaborative Diagnosis of Broken Overlay Routes
DSN '07 Proceedings of the 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks
Accurate Real-time Identification of IP Prefix Hijacking
SP '07 Proceedings of the 2007 IEEE Symposium on Security and Privacy
PHAS: a prefix hijack alert system
USENIX-SS'06 Proceedings of the 15th conference on USENIX Security Symposium - Volume 15
Exploiting availability prediction in distributed systems
NSDI'06 Proceedings of the 3rd conference on Networked Systems Design & Implementation - Volume 3
A study of prefix hijacking and interception in the internet
Proceedings of the 2007 conference on Applications, technologies, architectures, and protocols for computer communications
A light-weight distributed scheme for detecting ip prefix hijacks in real-time
Proceedings of the 2007 conference on Applications, technologies, architectures, and protocols for computer communications
iPlane: an information plane for distributed services
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Querying the internet with PIER
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Hi-index | 0.00 |
Large-scale distributed systems span thousands of independent hosts which come online and go offline at their users' whim. Such availability flux is ostensibly a key concern when systems are designed, but this flux is rarely measured in a rich way post-deployment, either by the distributed system itself or by a standalone piece of infrastructure. In this paper we introduce StrobeLight, a tool for monitoring per-host availability trends in enterprise settings. Every 30 seconds, StrobeLight probes Microsoft's entire corporate network, archiving the ping results for use by other networked services. We describe two such services, one offline and the other online. The first service uses longitudinal data collected by our StrobeLight deployment to analyze large-scale trends in our wired and wireless networks. The second service draws live StrobeLight measurements to detect network anomalies like IP hijacking in real time. StrobeLight is easy to deploy, requiring neither modification to end hosts nor changes to the core routing infrastructure. Furthermore, it requires minimal network and CPU resources to probe our network of over 200,000 hosts.