STAR: self-tuning aggregation for scalable monitoring

  • Authors:
  • Navendu Jain;Dmitry Kit;Prince Mahajan;Praveen Yalagandula;Mike Dahlin;Yin Zhang

  • Affiliations:
  • University of Texas at Austin;University of Texas at Austin;University of Texas at Austin;Hewlett-Packard Labs, Palo Alto, CA;University of Texas at Austin;University of Texas at Austin

  • Venue:
  • VLDB '07 Proceedings of the 33rd international conference on Very large data bases
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present STAR, a self-tuning algorithm that adaptively sets numeric precision constraints to accurately and efficiently answer continuous aggregate queries over distributed data streams. Adaptivity and approximation are essential for both robustness to varying workload characteristics and for scalability to large systems. In contrast to previous studies, we treat the problem as a workload-aware optimization problem whose goal is to minimize the total communication load for a multi-level aggregation tree under a fixed error budget. STAR's hierarchical algorithm takes into account the update rate and variance in the input data distribution in a principled manner to compute an optimal error distribution, and it performs cost-benefit throttling to direct error slack to where it yields the largest benefits. Our prototype implementation of STAR in a large-scale monitoring system provides (1) a new distribution mechanism that enables self-tuning error distribution and (2) an optimization to reduce communication overhead in a practical setting by carefully distributing the initial, default error budgets. Through extensive simulations and experiments on a real network monitoring implementation, we show that STAR achieves significant performance benefits compared to existing approaches while still providing high accuracy and incurring low overheads.