Communication-efficient distributed monitoring of thresholded counts

  • Authors:
  • Ram Keralapura;Graham Cormode;Jeyashankher Ramamirtham

  • Affiliations:
  • UC Davis;Bell Labs;Bell Labs

  • Venue:
  • Proceedings of the 2006 ACM SIGMOD international conference on Management of data
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Monitoring is an issue of primary concern in current and next generation networked systems. For ex, the objective of sensor networks is to monitor their surroundings for a variety of different applications like atmospheric conditions, wildlife behavior, and troop movements among others. Similarly, monitoring in data networks is critical not only for accounting and management, but also for detecting anomalies and attacks. Such monitoring applications are inherently continuous and distributed, and must be designed to minimize the communication overhead that they introduce. In this context we introduce and study a fundamental class of problems called "thresholded counts" where we must return the aggregate frequency count of an event that is continuously monitored by distributed nodes with a user-specified accuracy whenever the actual count exceeds a given threshold value.In this paper we propose to address the problem of thresholded counts by setting local thresholds at each monitoring node and initiating communication only when the locally observed data exceeds these local thresholds. We explore algorithms in two categories: static and adaptive thresholds. In the static case, we consider thresholds based on a linear combination of two alternate strategies, and show that there exists an optimal blend of the two strategies that results in minimum communication overhead. We further show that this optimal blend can be found using a steepest descent search. In the adaptive case, we propose algorithms that adjust the local thresholds based on the observed distributions of updated information. We use extensive simulations not only to verify the accuracy of our algorithms and validate our theoretical results, but also to evaluate the performance of our algorithms. We find that both approaches yield significant savings over the naive approach of centralized processing.