While traditional database systems optimize for performance on one-shot queries, emerging large-scale monitoring applications require continuous tracking of complex aggregates and data-distribution summaries over collections of physically distributed streams. Effective solutions therefore have to be simultaneously space efficient (at each remote site) and communication efficient (across the underlying communication network), while providing continuous, guaranteed-quality estimates. In this paper, we propose novel algorithmic solutions for the problem of continuously tracking complex holistic aggregates in such a distributed-streams setting. Our primary focus is on approximate quantile summaries, but our approach is more broadly applicable and can handle other holistic-aggregate functions (e.g., "heavy-hitters" queries). We present the first known distributed-tracking schemes for maintaining accurate quantile estimates with provable approximation guarantees, while simultaneously optimizing the storage space at each remote site as well as the communication cost across the network. In a nutshell, our algorithms combine local tracking at remote sites with simple prediction models for local site behavior in order to produce highly communication- and space-efficient solutions. We perform extensive experiments with real and synthetic data to explore the various tradeoffs and understand the role of prediction models in our schemes. The results clearly validate our approach, revealing significant savings over naive solutions as well as over our analytical worst-case guarantees.
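To make the idea of "local tracking plus a simple prediction model" concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical `RemoteSite` class that keeps its values exactly (a real site would use a small-space quantile summary), a "static" prediction model in which the coordinator assumes ranks stay as last reported, and two illustrative parameters, `eps` (spacing of reported quantiles) and `theta` (tolerated rank drift as a fraction of the local count). All of these names and choices are assumptions made for illustration.

```python
import bisect


class RemoteSite:
    """Hypothetical remote site (illustrative only).

    It tracks its local stream and re-sends a quantile summary to the
    coordinator only when the coordinator's prediction has drifted by
    more than theta * n, where n is the current local count.
    """

    def __init__(self, eps, theta):
        self.eps = eps        # assumed spacing between reported quantiles
        self.theta = theta    # assumed tolerated drift, as a fraction of n
        self.values = []      # exact local state, kept sorted (a simplification)
        self.sent = []        # last summary sent: list of (value, rank)
        self.sent_n = 0       # local count at the time of the last message
        self.messages = 0     # number of messages sent to the coordinator

    def _summary(self):
        # Pick every step-th value and record its exact current rank.
        n = len(self.values)
        step = max(1, int(self.eps * n))
        return [
            (self.values[i], bisect.bisect_right(self.values, self.values[i]))
            for i in range(step - 1, n, step)
        ]

    def _send(self):
        self.sent = self._summary()
        self.sent_n = len(self.values)
        self.messages += 1

    def _predicted_rank(self, rank_at_send):
        # "Static" prediction model: assume nothing changed since the last message.
        return rank_at_send

    def observe(self, x):
        """Process one local update; return True if a message was sent."""
        bisect.insort(self.values, x)
        n = len(self.values)
        budget = self.theta * n
        if not self.sent or abs(n - self.sent_n) > budget:
            self._send()
            return True
        for value, rank in self.sent:
            true_rank = bisect.bisect_right(self.values, value)
            if abs(true_rank - self._predicted_rank(rank)) > budget:
                self._send()
                return True
        return False


if __name__ == "__main__":
    site = RemoteSite(eps=0.1, theta=0.05)
    for i in range(2000):
        site.observe((i * 37) % 101)
    print(f"{site.messages} messages for {len(site.values)} updates")
```

In this sketch the site stays silent as long as every reported rank, and the local count, is within `theta * n` of what the coordinator would predict, so the number of messages grows much more slowly than the number of updates. A richer prediction model (for example, one that extrapolates growth) could replace `_predicted_rank` without changing the rest of the structure.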