We investigate the problem of finding frequent items in a continuous data stream and present an algorithm, λ-HCount, that computes frequency counts of stream data under a time fading model. The algorithm uses r hash functions to estimate the density values of stream data items, and applies a time fading factor to emphasize recent items. For a given error bound, the algorithm detects approximate frequent items with a certain probability using a limited amount of memory. The memory requirement depends only on the number of distinct data items and the number of hash functions used. Experimental results on synthetic and real data sets show that the algorithm outperforms other methods in accuracy, memory requirement, and processing speed.
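To make the idea of hashed counters combined with a time fading factor concrete, here is a minimal Python sketch. It is not the λ-HCount algorithm itself, whose details the abstract does not give. It is a generic count-min-style table with exponential decay, written under several assumptions: the class name `FadingCountSketch`, the table width `m`, the use of salted BLAKE2b as the r hash functions, integer non-decreasing timestamps, and the "take the minimum over rows" estimator are all illustrative choices rather than anything taken from the paper.

```python
import hashlib


class FadingCountSketch:
    """Illustrative decayed counter table (assumed design, not the paper's)."""

    def __init__(self, r=4, m=1024, lam=0.98):
        if not 0.0 < lam <= 1.0:
            raise ValueError("fading factor must be in (0, 1]")
        self.r = r          # number of hash functions (rows)
        self.m = m          # counters per row (assumed parameter)
        self.lam = lam      # time fading factor
        self.value = [[0.0] * m for _ in range(r)]
        self.stamp = [[0] * m for _ in range(r)]  # last update time per cell

    def _slot(self, row, item):
        # One salted hash per row; deterministic across runs.
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.m

    def _decayed(self, row, col, now):
        # Fade the stored value by lam for every elapsed time unit.
        age = now - self.stamp[row][col]
        return self.value[row][col] * self.lam ** age

    def add(self, item, now):
        """Record one occurrence of item at time now (non-decreasing)."""
        for row in range(self.r):
            col = self._slot(row, item)
            self.value[row][col] = self._decayed(row, col, now) + 1.0
            self.stamp[row][col] = now

    def estimate(self, item, now):
        """Decayed count estimate; collisions can only inflate it."""
        return min(
            self._decayed(row, self._slot(row, item), now)
            for row in range(self.r)
        )


sketch = FadingCountSketch(r=3, m=64, lam=0.5)
sketch.add("a", now=0)
sketch.add("a", now=1)
print(sketch.estimate("a", now=1))  # 1.5: the older hit has faded to 0.5
```

In this sketch an item would be reported as frequent when its decayed estimate exceeds a chosen threshold. Storing a timestamp per cell lets decay be applied lazily on access, so no pass over the whole table is needed at each time step.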