Maintaining Stream Statistics over Sliding Windows
SIAM Journal on Computing
Maintaining variance and k-medians over data stream windows
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Correlating synchronous and asynchronous data streams
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Finding (Recently) Frequent Items in Distributed Data Streams
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Flexible time management in data stream systems
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Approximate counts and quantiles over sliding windows
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Space- and time-efficient deterministic algorithms for biased quantiles over data streams
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Sketching asynchronous streams over a sliding window
Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing
Data streams: algorithms and applications
Foundations and Trends® in Theoretical Computer Science
Time-decaying sketches for sensor data aggregation
Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
Time-decaying aggregates in out-of-order streams
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Approximating sliding windows by cyclic tree-like histograms for efficient range queries
Data & Knowledge Engineering
Time-decaying Sketches for Robust Aggregation of Sensor Data
SIAM Journal on Computing
Approximating frequent items in asynchronous data stream over a sliding window
WAOA'09 Proceedings of the 7th international conference on Approximation and Online Algorithms
Sketch-based querying of distributed sliding-window data streams
Proceedings of the VLDB Endowment
Mining frequent items in data stream using time fading model
Information Sciences: an International Journal
Hi-index | 0.00 |
We consider the problem of maintaining aggregates over recent elements of a massive data stream. Motivated by applications involving network data, we consider asynchronous data streams, where the observed order of data may be different from the order in which the data was generated. The set of recent elements is modeled as a sliding timestamp window of the stream, whose elements are changing continuously with time. We present the first deterministic algorithms for maintaining a small space summary of elements in a sliding timestamp window of an asynchronous data stream. The summary can return approximate answers for the following fundamental aggregates: basic count, the number of elements within the sliding window, and sum, the sum of all element values within the sliding window. For basic counting, the space taken by our summary is O(logW ċ log B ċ (log W + log B)/Ɛ) bits, where B is an upper bound on the value of the basic count, W is an upper bound on the width of the timestamp window, and Ɛ is the desired relative error. Our algorithms are based on a novel data structure called splittable histogram. Prior to this work, randomized algorithms were known for this problem, which provide weaker guarantees than those provided by our deterministic algorithms.