We consider the problem of maintaining aggregates and statistics over data streams, with respect to the last N data elements seen so far. We refer to this model as the sliding window model. We consider the following basic problem: given a stream of bits, maintain a count of the number of 1's in the last N elements seen from the stream. We show that using O((1/ε) log² N) bits of memory, we can estimate the number of 1's to within a factor of 1 + ε. We also give a matching lower bound of Ω((1/ε) log² N) memory bits for any deterministic or randomized algorithm. We extend our scheme to maintain the sum of the last N positive integers, and we provide matching upper and lower bounds for this more general problem as well. We apply our techniques to obtain efficient algorithms for the Lp norms (for p ∈ [1, 2]) of vectors under the sliding window model. Using the algorithm for the basic counting problem, one can adapt many other techniques to work for the sliding window model, with a multiplicative overhead of O((1/ε) log N) in memory and a 1 + ε factor loss in accuracy. These include maintaining approximate histograms, hash tables, and statistics or aggregates such as sum and averages.
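To make the basic counting problem concrete, the following is a minimal Python sketch of one way to count 1's over the last N bits with buckets whose sizes are powers of two. It is an illustration under stated assumptions, not necessarily the exact scheme or constants of the paper: the class name ExpHistogramCounter, the merge threshold of ceil(k/2) + 1 buckets per size with k = ceil(1/ε), and the midpoint estimate are all choices made for this example. Each bucket stores only the arrival time of its most recent 1. A bucket is dropped once that time leaves the window, so only the oldest bucket can straddle the window boundary, and it is the sole source of error.

```python
from collections import deque
import math


class ExpHistogramCounter:
    """Approximate count of 1's among the last `window` bits (illustrative sketch)."""

    def __init__(self, window, eps):
        self.window = window
        k = math.ceil(1 / eps)
        # Assumed threshold: at most ceil(k/2) + 1 buckets of each size.
        self.max_per_size = math.ceil(k / 2) + 1
        self.time = 0    # number of bits seen so far
        self.total = 0   # sum of the sizes of all live buckets
        # levels[j] holds the timestamps of buckets of size 2**j, oldest first.
        self.levels = []

    def _expire(self):
        # The oldest bucket is the leftmost entry of the highest level.
        while self.levels and self.levels[-1][0] <= self.time - self.window:
            self.levels[-1].popleft()
            self.total -= 2 ** (len(self.levels) - 1)
            while self.levels and not self.levels[-1]:
                self.levels.pop()

    def add(self, bit):
        self.time += 1
        self._expire()
        if not bit:
            return
        if not self.levels:
            self.levels.append(deque())
        self.levels[0].append(self.time)
        self.total += 1
        j = 0
        # Cascade: merge the two oldest buckets of an over-full size into
        # one bucket of twice the size, keeping the newer timestamp.
        while len(self.levels[j]) > self.max_per_size:
            self.levels[j].popleft()
            newer = self.levels[j].popleft()
            if j + 1 == len(self.levels):
                self.levels.append(deque())
            self.levels[j + 1].append(newer)
            j += 1

    def estimate(self):
        if not self.levels:
            return 0.0
        last = 2 ** (len(self.levels) - 1)  # size of the oldest bucket
        # Between 1 and `last` of the oldest bucket's 1's are still in the
        # window; report the midpoint of the resulting range.
        return self.total - (last - 1) / 2


counter = ExpHistogramCounter(window=8, eps=0.5)
for b in [1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1]:
    counter.add(b)
print(counter.estimate())
```

With this threshold, every bucket size smaller than the oldest bucket's size is represented by at least ceil(k/2) buckets, so the oldest bucket's uncertainty is at most a 1/k ≤ ε fraction of the true count. The number of buckets grows as O((1/ε) log N), and each stores a timestamp of O(log N) bits, which matches the shape of the memory bound stated above.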