The problem of discovering frequent items in streaming data has attracted considerable attention, mainly because of its numerous applications in diverse domains, such as network traffic monitoring and e-business transaction analysis. Although the problem has been studied extensively and several techniques have been proposed for its solution, these approaches are geared towards the most recent values in the stream. In several situations, however, users would like to query item frequencies in ad hoc windows of the stream history and compare these values with one another. In this paper, we address the problem of finding frequent items in ad hoc windows of a data stream given a small, bounded memory, and present novel algorithms for this purpose. We propose basic sketch-based and count-based algorithms that extend the functionality of existing approaches by monitoring item frequencies in the stream. Subsequently, we present an improved version of the algorithm with significantly better accuracy at no extra memory cost. Moreover, we propose an efficient non-linear model to better estimate the frequencies within the query windows. Finally, we conduct an extensive experimental evaluation with synthetic and real datasets, which demonstrates the merits of the proposed solutions and provides guidelines for practitioners in the field.
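For readers unfamiliar with count-based frequent-item summaries, the following is a minimal sketch of the general idea, using the well-known Misra-Gries counter scheme. It is not the algorithm proposed in this paper, and it does not handle ad hoc windows; the function name `misra_gries`, the parameter `k`, and the sample stream are assumptions introduced here for illustration only.

```python
def misra_gries(stream, k):
    """Summarize a stream with at most k - 1 counters (Misra-Gries).

    Any item occurring more than n / k times in a stream of length n
    is guaranteed to survive in the returned dict. Each stored count
    underestimates the true count by at most n / k.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: increment its counter.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A free counter is available: start tracking the item.
            counters[item] = 1
        else:
            # No free counter: decrement all counters and drop
            # those that reach zero. The new item is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Hypothetical sample stream: "a" occurs 4 times out of 8 items.
sample_stream = ["a", "b", "a", "c", "a", "d", "a", "b"]
summary = misra_gries(sample_stream, k=3)
print(summary)
```

The memory used depends only on `k`, not on the stream length, which is the property that makes this family of summaries attractive under a small bounded memory. Answering queries over arbitrary past windows, as the abstract describes, requires additional machinery that this sketch does not attempt.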