On the self-similar nature of Ethernet traffic (extended version)
IEEE/ACM Transactions on Networking (TON)
Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Adaptive filters for continuous queries over distributed data streams
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Power-conserving computation of order-statistics over sensor networks
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Holistic aggregates in a networked world: distributed tracking of approximate quantiles
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Sketching streams through the net: distributed approximate query tracking
VLDB '05 Proceedings of the 31st international conference on Very large data bases
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Pattern Recognition and Machine Learning (Information Science and Statistics)
Pattern Recognition and Machine Learning (Information Science and Statistics)
Data streams: algorithms and applications
Foundations and Trends® in Theoretical Computer Science
Algorithms for distributed functional monitoring
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Optimal tracking of distributed heavy hitters and quantiles
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Functional Monitoring without Monotonicity
ICALP '09 Proceedings of the 36th International Colloquium on Automata, Languages and Programming: Part I
Optimal sampling from distributed streams
Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Optimal random sampling from distributed streams revisited
DISC'11 Proceedings of the 25th international conference on Distributed computing
Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
Foundations and Trends in Databases
The continuous distributed monitoring model
ACM SIGMOD Record
Report on the first workshop on innovative querying of streams
ACM SIGMOD Record
Hi-index | 0.00 |
We consider the continual count tracking problem in a distributed environment where the input is an aggregate stream that originates from k distinct sites and the updates are allowed to be non-monotonic, i.e. both increments and decrements are allowed. The goal is to continually track the count within a prescribed relative accuracy ε at the lowest possible communication cost. Specifically, we consider an adversarial setting where the input values are selected and assigned to sites by an adversary but the order is according to a random permutation or is a random i.i.d process. The input stream of values is allowed to be non-monotonic with an unknown drift -1≤μ=1 where the case μ = 1 corresponds to the special case of a monotonic stream of only non-negative updates. We show that a randomized algorithm guarantees to track the count accurately with high probability and has the expected communication cost Õ(min√k/(|#956;|ε), √k n/ε, n}), for an input stream of length n, and establish matching lower bounds. This improves upon previously best known algorithm whose expected communication cost is Θ(min√k/ε,n]) that applies only to an important but more restrictive class of monotonic input streams, and our results are substantially more positive than the communication complexity of Ω(n) under fully adversarial input. We also show how our framework can also accommodate other types of random input streams, including fractional Brownian motion that has been widely used to model temporal long-range dependencies observed in many natural phenomena. Last but not least, we show how our non-monotonic counter can be applied to track the second frequency moment and to a Bayesian linear regression problem.