The space complexity of approximating the frequency moments
STOC '96 Proceedings of the twenty-eighth annual ACM symposium on Theory of computing
Tracking join and self-join sizes in limited storage
PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Maintaining stream statistics over sliding windows: (extended abstract)
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Processing complex aggregate queries over data streams
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Dynamic multidimensional histograms
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries
Proceedings of the 27th International Conference on Very Large Data Bases
Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports
Proceedings of the 27th International Conference on Very Large Data Bases
Finding Frequent Items in Data Streams
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
The design of an acquisitional query processor for sensor networks
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Adaptive filters for continuous queries over distributed data streams
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Gigascope: a stream database for network applications
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Finding (Recently) Frequent Items in Distributed Data Streams
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Power-conserving computation of order-statistics over sensor networks
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Holistic aggregates in a networked world: distributed tracking of approximate quantiles
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
An improved data stream summary: the count-min sketch and its applications
Journal of Algorithms
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Distributed set-expression cardinality estimation
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Model-driven data acquisition in sensor networks
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Secure distributed data-mining and its application to large-scale network measurements
ACM SIGCOMM Computer Communication Review
Communication-efficient distributed monitoring of thresholded counts
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Online outlier detection in sensor data using non-parametric models
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
An energy-efficient querying framework in sensor networks for detecting node similarities
Proceedings of the 9th ACM international symposium on Modeling analysis and simulation of wireless and mobile systems
Data streams: algorithms and applications
Foundations and Trends® in Theoretical Computer Science
Discovering and exploiting keyword and attribute-value co-occurrences to improve P2P routing indices
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Optimized query planning of continuous aggregation queries in dynamic data dissemination networks
Proceedings of the 16th international conference on World Wide Web
Statistical analysis of sketch estimators
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Sharing aggregate computation for distributed queries
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Streaming in a connected world: querying and tracking distributed data streams
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Algorithms for distributed functional monitoring
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Approximate continuous querying over distributed streams
ACM Transactions on Database Systems (TODS)
Shape sensitive geometric monitoring
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Sketches for size of join estimation
ACM Transactions on Database Systems (TODS)
Exploiting correlated keywords to improve approximate information filtering
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
FIDS: Monitoring Frequent Items over Distributed Data Streams
MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
Efficiently Monitoring Nearest Neighbors to a Moving Object
ADMA '07 Proceedings of the 3rd international conference on Advanced Data Mining and Applications
GS-TMS: a global stream-based threat monitor system
Proceedings of the VLDB Endowment
ACM Transactions on Computer Systems (TOCS)
Multi-dimensional online tracking
SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
Making filters smart in distributed data stream environments
Information Sciences: an International Journal
Flooding-Assisted Threshold Assignment for Aggregate Monitoring in Sensor Networks
ICDCN '09 Proceedings of the 10th International Conference on Distributed Computing and Networking
Optimal tracking of distributed heavy hitters and quantiles
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Query Recommendations for Interactive Database Exploration
SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management
Cost-aware reactive monitoring in resource-constrained wireless sensor networks
WCNC'09 Proceedings of the 2009 IEEE conference on Wireless Communications & Networking Conference
Aggregate computation over data streams
APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
Research on cost-efficient processing of continuous extreme queires
IITA'09 Proceedings of the 3rd international conference on Intelligent information technology application
A sparse Johnson: Lindenstrauss transform
Proceedings of the forty-second ACM symposium on Theory of computing
Optimal sampling from distributed streams
Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Fast Manhattan sketches in data streams
Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Approximating sliding windows by cyclic tree-like histograms for efficient range queries
Data & Knowledge Engineering
Algorithms for distributed functional monitoring
ACM Transactions on Algorithms (TALG)
Tracking distributed aggregates over time-based sliding windows
Proceedings of the 30th annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing
Mining frequent itemsets over distributed data streams by continuously maintaining a global synopsis
Data Mining and Knowledge Discovery
Continuous distributed monitoring: a short survey
Proceedings of the First International Workshop on Algorithms and Models for Distributed Event Processing
Optimal random sampling from distributed streams revisited
DISC'11 Proceedings of the 25th international conference on Distributed computing
Lower bounds for number-in-hand multiparty communication complexity, made easy
Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete Algorithms
Practical algorithms for tracking database join sizes
FSTTCS '05 Proceedings of the 25th international conference on Foundations of Software Technology and Theoretical Computer Science
Multidimensional online tracking
ACM Transactions on Algorithms (TALG)
Continuous sampling from distributed streams
Journal of the ACM (JACM)
Research on prediction models over distributed data streams
WISE'06 Proceedings of the 7th international conference on Web Information Systems
Continuous distributed counting for non-monotonic streams
PODS '12 Proceedings of the 31st symposium on Principles of Database Systems
Prediction-based geometric monitoring over distributed data streams
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Tight bounds for distributed functional monitoring
STOC '12 Proceedings of the forty-fourth annual ACM symposium on Theory of computing
Survey: Streaming techniques and data aggregation in networks of tiny artefacts
Computer Science Review
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Foundations and Trends in Databases
Efficient distributed computation of human mobility aggregates through user mobility profiles
Proceedings of the ACM SIGKDD International Workshop on Urban Computing
CR-PRECIS: a deterministic summary structure for update data streams
ESCAPE'07 Proceedings of the First international conference on Combinatorics, Algorithms, Probabilistic and Experimental Methodologies
The continuous distributed monitoring model
ACM SIGMOD Record
Hi-index | 0.00 |
Emerging large-scale monitoring applications require continuous tracking of complex data-analysis queries over collections of physically-distributed streams. Effective solutions have to be simultaneously space/time efficient (at each remote monitor site), communication efficient (across the underlying communication network), and provide continuous, guaranteed-quality approximate query answers. In this paper, we propose novel algorithmic solutions for the problem of continuously tracking a broad class of complex aggregate queries in such a distributed-streams setting. Our tracking schemes maintain approximate query answers with provable error guarantees, while simultaneously optimizing the storage space and processing time at each remote site, and the communication cost across the network. They rely on tracking general-purpose randomized sketch summaries of local streams at remote sites along with concise prediction models of local site behavior in order to produce highly communication- and space/time-efficient solutions. The result is a powerful approximate query tracking framework that readily incorporates several complex analysis queries (including distributed join and multi-join aggregates, and approximate wavelet representations), thus giving the first known low-overhead tracking solution for such queries in the distributed-streams model.