STAR: self-tuning aggregation for scalable monitoring

Authors:
Navendu Jain; Dmitry Kit; Prince Mahajan; Praveen Yalagandula; Mike Dahlin; Yin Zhang
Affiliations:
University of Texas at Austin; University of Texas at Austin; University of Texas at Austin; Hewlett-Packard Labs, Palo Alto, CA; University of Texas at Austin; University of Texas at Austin
Venue:
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Year:
2007

Citing 34 (works referenced by this paper; first list below)
Cited 17 (works that cite this paper; second list below)

Temporal notions of synchronization and consistency in Beehive

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Accessing nearby copies of replicated objects in a distributed environment

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Eddies: continuously adaptive query processing

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
A robust, optimization-based approach for approximate answering of aggregate queries

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Chord: A scalable peer-to-peer lookup service for internet applications

Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
A scalable content-addressable network

Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
New directions in traffic measurement and accounting

ACM SIGCOMM Computer Communication Review
Models and issues in data stream systems

Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Design and evaluation of a conit-based continuous consistency model for replicated services

ACM Transactions on Computer Systems (TOCS)
Offering a Precision-Performance Tradeoff for Aggregation Queries over Replicated Data

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Serving DNS Using a Peer-to-Peer Lookup Service

IPTPS '01 Revised Papers from the First International Workshop on Peer-to-Peer Systems
Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems

Middleware '01 Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg
Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining

ACM Transactions on Computer Systems (TOCS)
A knowledge plane for the internet

Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications
Distributed top-k monitoring

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Cache-and-query for wide area sensor databases

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Adaptive filters for continuous queries over distributed data streams

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Gigascope: a stream database for network applications

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and

Tapestry: An Infrastructure for Fault-tolerant Wide-area Location and
SHARP: an architecture for secure resource peering

SOSP '03 Proceedings of the nineteenth ACM symposium on Operating systems principles
Sophia: an Information Plane for networked systems

ACM SIGCOMM Computer Communication Review
Adaptive stream resource management using Kalman Filters

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Adaptive ordering of pipelined stream filters

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Mercury: supporting scalable multi-attribute range queries

Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications
A scalable distributed information management system

Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications
Predictive filtering: a learning-based approach to data stream filtering

DMSN '04 Proceedings of the 1st international workshop on Data management for sensor networks: in conjunction with VLDB 2004
Finding (Recently) Frequent Items in Distributed Data Streams

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
TAG: a Tiny AGgregation service for Ad-Hoc sensor networks

OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementation
An integrated experimental environment for distributed systems and networks

OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementation
Communication-efficient distributed monitoring of thresholded counts

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
S3: a scalable sensing service for monitoring large networked systems

Proceedings of the 2006 SIGCOMM workshop on Internet network management
SkipNet: a scalable overlay network with practical locality properties

USITS'03 Proceedings of the 4th conference on USENIX Symposium on Internet Technologies and Systems - Volume 4
Communication-Efficient Tracking of Distributed Cumulative Triggers

ICDCS '07 Proceedings of the 27th International Conference on Distributed Computing Systems
Querying the internet with PIER

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29

Moara: flexible and scalable group-based querying system

Proceedings of the 9th ACM/IFIP/USENIX International Conference on Middleware
Efficient on-demand operations in dynamic distributed infrastructures

LADIS '08 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware
A Partition-Based Broadcast Algorithm over DHT for Large-Scale Computing Infrastructures

GPC '09 Proceedings of the 4th International Conference on Advances in Grid and Pervasive Computing
Self-correlating predictive information tracking for large-scale production systems

ICAC '09 Proceedings of the 6th international conference on Autonomic computing
Towards efficient event aggregation in a decentralized publish-subscribe system

Proceedings of the Third ACM International Conference on Distributed Event-Based Systems
DHT-based lightweight broadcast algorithms in large-scale computing infrastructures

Future Generation Computer Systems
Processing continuous join queries in sensor networks: a filtering approach

Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Network imprecision: a new consistency metric for scalable monitoring

OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
The declarative imperative: experiences and conjectures in distributed logic

ACM SIGMOD Record
OLIC: online information compression for scalable hosting infrastructure monitoring

Proceedings of the Nineteenth International Workshop on Quality of Service
Continuous distributed monitoring: a short survey

Proceedings of the First International Workshop on Algorithms and Models for Distributed Event Processing
Bringing introspection into BlobSeer: Towards a self-adaptive distributed data management system

International Journal of Applied Mathematics and Computer Science - SPECIAL SECTION: Efficient Resource Management for Grid-Enabled Applications
Review: Reliable spatial window aggregation query processing algorithm in wireless sensor networks

Journal of Network and Computer Applications
Self-adaptive approximate queries for large-scale information aggregation

International Journal of Web and Grid Services
Aggregation for implicit invocations

Proceedings of the 12th annual international conference on Aspect-oriented software development
The continuous distributed monitoring model

ACM SIGMOD Record
Decentralized monitoring in peer-to-peer systems

Benchmarking Peer-to-Peer Systems

Abstract

We present STAR, a self-tuning algorithm that adaptively sets numeric precision constraints to accurately and efficiently answer continuous aggregate queries over distributed data streams. Adaptivity and approximation are essential both for robustness to varying workload characteristics and for scalability to large systems. In contrast to previous studies, we treat the problem as a workload-aware optimization problem whose goal is to minimize the total communication load for a multi-level aggregation tree under a fixed error budget. STAR's hierarchical algorithm takes into account the update rate and variance in the input data distribution in a principled manner to compute an optimal error distribution, and it performs cost-benefit throttling to direct error slack to where it yields the largest benefits. Our prototype implementation of STAR in a large-scale monitoring system provides (1) a new distribution mechanism that enables self-tuning error distribution and (2) an optimization to reduce communication overhead in a practical setting by carefully distributing the initial, default error budgets. Through extensive simulations and experiments on a real network monitoring implementation, we show that STAR achieves significant performance benefits compared to existing approaches while still providing high accuracy and incurring low overheads.
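
The idea of splitting a fixed error budget according to update rate and variance can be illustrated with a minimal sketch. It is not taken from the paper's implementation. It assumes a simple Chebyshev-style cost model in which a child with update rate u, standard deviation s, and error budget d sends about u * min(1, s^2 / d^2) messages per unit time. Under that assumed model, minimizing the total load subject to the budgets summing to a fixed total gives budgets proportional to (u * s^2)^(1/3). The function names and the cost model are hypothetical, chosen only for illustration.

```python
def allocate_error_budget(total_budget, update_rates, std_devs):
    """Split total_budget among children in proportion to (u * s^2)^(1/3).

    Assumption (not from the source): per-child message load is modeled as
    u * min(1, s^2 / d^2), which is what makes this split load-minimizing.
    """
    weights = [(u * s * s) ** (1.0 / 3.0)
               for u, s in zip(update_rates, std_devs)]
    if not weights:
        return []
    total_weight = sum(weights)
    if total_weight == 0:
        # No child ever updates: fall back to an even split.
        return [total_budget / len(weights)] * len(weights)
    return [total_budget * w / total_weight for w in weights]


def estimated_load(budgets, update_rates, std_devs):
    """Estimated messages per unit time under the assumed cost model."""
    load = 0.0
    for d, u, s in zip(budgets, update_rates, std_devs):
        if d <= 0:
            load += u  # zero budget: every update must be reported
        else:
            load += u * min(1.0, (s * s) / (d * d))
    return load


if __name__ == "__main__":
    rates = [8.0, 1.0]      # hypothetical updates per second for two children
    sigmas = [1.0, 1.0]     # hypothetical standard deviations
    tuned = allocate_error_budget(12.0, rates, sigmas)
    uniform = [6.0, 6.0]
    print("tuned budgets:", tuned)
    print("tuned load:", estimated_load(tuned, rates, sigmas))
    print("uniform load:", estimated_load(uniform, rates, sigmas))
```

In this toy setting the frequently updating child receives the larger share of the budget, and the estimated load is lower than with an even split. A multi-level tree would apply the same split recursively at each internal node, which this sketch does not attempt.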