We present A-GAP, a novel protocol for continuous monitoring of network state variables, which aims to achieve a given monitoring accuracy with minimal overhead. Network state variables are computed from device counters using aggregation functions, such as SUM, AVERAGE and MAX. The accuracy objective is expressed as the average estimation error. A-GAP is decentralized and asynchronous to achieve robustness and scalability. It executes on an overlay that interconnects management processes on the devices. On this overlay, the protocol maintains a spanning tree and updates the network state variables through incremental aggregation. Based on a stochastic model, it dynamically configures local filters that control whether an update is sent towards the root of the tree. We evaluate A-GAP through simulation using real traces and two different types of topologies of up to 650 nodes. The results show that we can effectively control the trade-off between accuracy and protocol overhead, and that the overhead can be reduced by almost two orders of magnitude when allowing for small errors. The protocol quickly adapts to a node failure and exhibits only short spikes in the estimation error. Lastly, it can provide an accurate estimate of the error distribution in real time.
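The idea of incremental aggregation with local filters can be illustrated with a minimal Python sketch. It is not the A-GAP protocol itself: the `Node` class, its field names, and the fixed `filter_width` values are assumptions made for illustration, whereas A-GAP configures its filters dynamically from a stochastic model. Each node keeps a partial SUM over its subtree and forwards an update to its parent only when that partial SUM has moved by more than its filter width since the last update it sent.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical management process on one device in the spanning tree."""

    name: str
    filter_width: float                 # assumed fixed here; A-GAP tunes this dynamically
    parent: Node | None = None
    local: float = 0.0                  # local device counter
    reported: dict[str, float] = field(default_factory=dict)  # last partial SUM per child
    last_sent: float = 0.0              # last partial SUM sent towards the root
    messages_sent: int = 0              # protocol overhead generated by this node

    def partial_sum(self) -> float:
        """SUM over this node's subtree, as currently known to this node."""
        return self.local + sum(self.reported.values())

    def set_local(self, value: float) -> None:
        """The local counter changed; re-aggregate and possibly notify the parent."""
        self.local = value
        self._maybe_send()

    def _receive(self, child: str, value: float) -> None:
        """A child reported a new partial SUM for its subtree."""
        self.reported[child] = value
        self._maybe_send()

    def _maybe_send(self) -> None:
        if self.parent is None:
            return  # the root holds the network-wide estimate and sends nothing
        current = self.partial_sum()
        # Local filter: suppress the update while the change stays within the filter.
        if abs(current - self.last_sent) > self.filter_width:
            self.last_sent = current
            self.messages_sent += 1
            self.parent._receive(self.name, current)


# Small tree: root <- a <- leaf, and root <- b
root = Node("root", filter_width=0.0)
a = Node("a", filter_width=2.0, parent=root)
b = Node("b", filter_width=0.0, parent=root)
leaf = Node("leaf", filter_width=0.0, parent=a)

b.set_local(10.0)
leaf.set_local(5.0)   # passes a's filter and reaches the root
leaf.set_local(6.0)   # change of 1.0 is absorbed by a's filter of width 2.0

estimate = root.partial_sum()
true_sum = root.local + a.local + b.local + leaf.local
print(estimate, true_sum, a.messages_sent)
```

In this run the last change at `leaf` never reaches the root, so the root's estimate lags the true SUM by 1.0 while node `a` saves one message. Widening the filters trades more estimation error for less overhead, which is the trade-off the abstract describes.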