Mining frequent items in data stream using time fading model

Authors:
Ling Chen;Qingling Mei
Venue:
Information Sciences: an International Journal
Year:
2014

Abstract

We investigate the problem of finding frequent items in a continuous data stream and present an algorithm, λ-HCount, that computes frequency counts of stream data under a time fading model. The algorithm uses r hash functions to estimate the density values of stream data items, and applies a time fading factor to emphasize the importance of recent items. For a given error bound, the algorithm detects approximate frequent items with a certain probability using a limited amount of memory. The memory requirement depends only on the number of distinct data items and the number of hash functions used. Experimental results on synthetic and real data sets show that the algorithm outperforms other methods in accuracy, memory requirement, and processing speed.
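To make the idea of combining several hash functions with a time fading factor concrete, here is a minimal Python sketch. It is not the authors' λ-HCount: the abstract does not give the update rule, so this uses a generic count-min-style table with exponential decay per arrival. The class name `FadingCountSketch`, the parameters `r`, `m`, and `fading`, and the choice of BLAKE2 as the hash family are all assumptions made for illustration.

```python
import hashlib


class FadingCountSketch:
    """Count-min-style sketch with exponential time fading (illustrative only)."""

    def __init__(self, r, m, fading):
        # r hash rows, m cells per row, fading factor in (0, 1] (all assumed names).
        if not 0.0 < fading <= 1.0:
            raise ValueError("fading must be in (0, 1]")
        self.r, self.m, self.fading = r, m, fading
        self.density = [[0.0] * m for _ in range(r)]
        self.last = [[0] * m for _ in range(r)]  # arrival time of last update per cell
        self.now = 0  # number of items seen so far

    def _cell(self, row, item):
        # One salted hash per row stands in for r independent hash functions.
        digest = hashlib.blake2b(
            repr(item).encode(), digest_size=8, salt=row.to_bytes(8, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.m

    def _decayed(self, row, col):
        # Apply the fading lazily: decay by fading^(time since last update).
        age = self.now - self.last[row][col]
        return self.density[row][col] * self.fading ** age

    def add(self, item):
        self.now += 1
        for row in range(self.r):
            col = self._cell(row, item)
            self.density[row][col] = self._decayed(row, col) + 1.0
            self.last[row][col] = self.now

    def estimate(self, item):
        # Collisions can only inflate a cell, so the minimum over rows
        # is an upper bound on the item's true decayed count.
        return min(self._decayed(row, self._cell(row, item)) for row in range(self.r))

    def total(self):
        # Decayed weight of the whole stream: sum of fading^i for i in 0..now-1.
        if self.fading == 1.0:
            return float(self.now)
        return (1.0 - self.fading ** self.now) / (1.0 - self.fading)
```

Under these assumptions, an item could be reported as frequent when `estimate(item) >= s * total()` for a chosen support threshold `s`. Decaying each cell lazily at access time avoids touching every cell on every arrival.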