Automatic management of large-scale production systems requires a continuous monitoring service that keeps track of the state of the managed system. However, it is challenging to achieve both scalability and high information precision while continuously monitoring a large number of distributed, time-varying metrics. In this paper, we present InfoTrack, a new self-correlating, predictive information tracking system that employs lightweight temporal and spatial correlation discovery methods to minimize continuous monitoring cost. InfoTrack combines metric value prediction within individual nodes with adaptive clustering among distributed nodes to suppress remote information updates in distributed system monitoring. We have implemented a prototype of InfoTrack and deployed it on PlanetLab. We evaluated its performance using both real system traces and micro-benchmark prototype experiments. The experimental results show that InfoTrack can reduce the continuous monitoring cost by 50-90% while maintaining high information precision (i.e., within an error bound of 0.01-0.05).
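
The idea of using in-node metric value prediction to suppress remote updates can be illustrated with a minimal sketch. This is not InfoTrack's actual algorithm, which the abstract does not specify. It assumes a simple linear extrapolation model that the monitored node and the monitor both maintain, and the names `LinearPredictor`, `track`, and `error_bound` are hypothetical. The node sends an update only when the shared prediction would drift from the real value by more than the error bound.

```python
from dataclasses import dataclass


@dataclass
class LinearPredictor:
    """Hypothetical shared model: extrapolates linearly from the last report."""
    value: float = 0.0   # last reported metric value
    slope: float = 0.0   # estimated change per tick
    steps: int = 0       # ticks elapsed since the last report

    def predict(self) -> float:
        return self.value + self.slope * self.steps

    def tick(self) -> None:
        self.steps += 1

    def correct(self, actual: float) -> None:
        # Re-estimate the slope from the previous report, then reset the clock.
        if self.steps > 0:
            self.slope = (actual - self.value) / self.steps
        self.value = actual
        self.steps = 0


def track(samples, error_bound):
    """Return (updates_sent, monitor_view) for a stream of metric samples."""
    node_model = LinearPredictor()     # copy kept on the monitored node
    monitor_model = LinearPredictor()  # identical copy kept by the monitor
    updates_sent = 0
    monitor_view = []
    for i, actual in enumerate(samples):
        if i > 0:
            node_model.tick()
            monitor_model.tick()
        # The node checks its own copy of the model; it reports only when
        # the monitor's prediction would violate the error bound.
        if i == 0 or abs(node_model.predict() - actual) > error_bound:
            node_model.correct(actual)
            monitor_model.correct(actual)
            updates_sent += 1
        monitor_view.append(monitor_model.predict())
    return updates_sent, monitor_view


if __name__ == "__main__":
    samples = [0.10, 0.12, 0.14, 0.16, 0.18, 0.50, 0.52]
    sent, view = track(samples, error_bound=0.05)
    print(f"sent {sent} of {len(samples)} samples")
```

Because both sides apply the same corrections, the monitor's view never deviates from the true value by more than `error_bound`, and steadier metrics trigger fewer updates. Exploiting spatial correlation by clustering nodes, as the abstract describes, would be a separate mechanism layered on top of this.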