We consider the problem of maintaining aggregates and statistics over data streams with respect to the last $N$ data elements seen so far. We refer to this model as the sliding window model. We consider the following basic problem: given a stream of bits, maintain a count of the number of 1's in the last $N$ elements seen from the stream. We show that, using $O(\frac{1}{\epsilon} \log^2 N)$ bits of memory, we can estimate the number of 1's to within a factor of $1 + \epsilon$. We also give a matching lower bound of $\Omega(\frac{1}{\epsilon}\log^2 N)$ memory bits for any deterministic or randomized algorithm. We extend our scheme to maintain the sum of the last $N$ positive integers and provide matching upper and lower bounds for this more general problem as well. We also show how to efficiently compute the $L_p$ norms ($p \in [1,2]$) of vectors in the sliding window model using our techniques. Using our algorithm, one can adapt many other techniques to work in the sliding window model with a multiplicative overhead of $O(\frac{1}{\epsilon}\log N)$ in memory and a $1 + \epsilon$ factor loss in accuracy. These include maintaining approximate histograms, hash tables, and statistics or aggregates such as sums and averages.
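To make the basic counting problem concrete, here is a minimal Python sketch of one way to approximate the number of 1's in the last $N$ bits: group the 1's into buckets whose sizes are powers of two, and keep only a bounded number of buckets per size. The abstract does not spell out the data structure, so the bucket layout, the class name `SlidingBitCounter`, the per-size limit `k + 1`, and the choice `k = ceil(1/eps)` are all assumptions made for illustration, not necessarily the authors' exact scheme.

```python
import math


class SlidingBitCounter:
    """Approximate count of 1's in the last `window` bits (illustrative sketch)."""

    def __init__(self, window, eps):
        self.window = window
        self.k = math.ceil(1 / eps)   # assumed parameter: buckets kept per size
        self.buckets = []             # oldest first; (time of newest 1, size)
        self.time = 0
        self.total = 0                # sum of all bucket sizes

    def add(self, bit):
        self.time += 1
        # Drop the oldest bucket once its newest 1 has left the window.
        while self.buckets and self.buckets[0][0] <= self.time - self.window:
            self.total -= self.buckets.pop(0)[1]
        if bit != 1:
            return
        self.buckets.append((self.time, 1))
        self.total += 1
        # Cascade: whenever a size has more than k + 1 buckets,
        # merge the two oldest of that size into one of twice the size.
        size = 1
        while True:
            idx = [i for i, b in enumerate(self.buckets) if b[1] == size]
            if len(idx) <= self.k + 1:
                break
            oldest, second = idx[0], idx[1]
            # The merged bucket keeps the newer of the two timestamps.
            self.buckets[second] = (self.buckets[second][0], 2 * size)
            del self.buckets[oldest]
            size *= 2

    def estimate(self):
        if not self.buckets:
            return 0
        # Only the oldest bucket may straddle the window boundary,
        # so charge it for roughly half of its size.
        return self.total - self.buckets[0][1] // 2


# Usage: feed bits one at a time, query at any point.
counter = SlidingBitCounter(window=1000, eps=0.1)
for i in range(5000):
    counter.add(1 if i % 3 else 0)
print(counter.estimate())
```

Under these assumptions the structure holds roughly $k \log N$ buckets. If each timestamp were stored in $O(\log N)$ bits (for example modulo the window length, which this sketch does not do), the footprint would be in line with the $O(\frac{1}{\epsilon}\log^2 N)$ bound stated above. Only the oldest bucket contributes uncertainty, which is what keeps the relative error near $1/k$.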