Monitoring data streams in a distributed system has been the focus of much research in recent years. Most of the proposed schemes, however, deal with monitoring simple aggregated values, such as the frequency of appearance of items in the streams. More involved challenges, such as the important task of feature selection (e.g., by monitoring the information gain of various features), still incur very high communication overhead when handled by naive, centralized algorithms. We present a novel geometric approach which reduces the task of monitoring the value of a function against a threshold to a set of constraints applied locally on each of the streams. The constraints are used to locally filter out data increments that do not affect the monitoring outcome, thus avoiding unnecessary communication. As a result, our approach enables efficient monitoring of arbitrary threshold functions over distributed data streams. We present experimental results on real-world data which demonstrate that our algorithms are highly scalable and considerably reduce communication load in comparison to centralized algorithms.
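The local filtering idea can be illustrated with a small sketch. The abstract does not spell out the constraints, so everything below is an assumption made for illustration: the monitored function is taken to be the squared Euclidean norm, each node is assumed to hold the last agreed estimate vector plus its own local vector at that time, and the local constraint is assumed to be that a ball spanning the estimate and the node's drifted vector stays entirely on one side of the threshold. The names `local_constraint_holds` and `ball_is_monochromatic` are hypothetical.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def ball_is_monochromatic(center, radius, threshold):
    """Assumed function f(x) = ||x||^2: check that f stays on one side
    of the threshold everywhere inside the ball (center, radius)."""
    c = norm(center)
    f_max = (c + radius) ** 2            # farthest point from the origin
    f_min = max(0.0, c - radius) ** 2    # closest point to the origin
    return f_min > threshold or f_max <= threshold


def local_constraint_holds(estimate, synced_local, current_local, threshold):
    """Return True if the node can stay silent (assumed ball constraint).

    estimate      -- global vector agreed on at the last synchronization
    synced_local  -- this node's local vector at the last synchronization
    current_local -- this node's local vector now
    """
    # Drift vector: the estimate shifted by the node's local change.
    drift = [e + (cur - old)
             for e, cur, old in zip(estimate, current_local, synced_local)]
    # Ball whose diameter is the segment from the estimate to the drift vector.
    center = [(e + d) / 2 for e, d in zip(estimate, drift)]
    radius = norm([e - d for e, d in zip(estimate, drift)]) / 2
    return ball_is_monochromatic(center, radius, threshold)


# A small local change is filtered out; a large one triggers communication.
small_change_ok = local_constraint_holds([1.0, 1.0], [1.0, 1.0], [1.2, 0.9], 4.0)
large_change_ok = local_constraint_holds([1.0, 1.0], [1.0, 1.0], [2.5, 1.0], 4.0)
```

In this sketch a node communicates only when its own check fails, which is one way to read "locally filter out data increments that do not affect the monitoring outcome". The squared norm was chosen only because its extremes over a ball have a closed form; other functions would need their own bounding step.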