A new algorithm for mining global frequent itemsets in a stream

Authors:
Lichao Guo;Hongye Su;Yu Qu
Affiliations:
State Key Lab. of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, P. R. China;State Key Lab. of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, P. R. China;State Key Lab. of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou, P. R. China
Venue:
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 5
Year:
2009

Abstract

To find global frequent itemsets in a multiple, continuous, rapid, and time-varying data stream, an algorithm must be fast, incremental, real-time, and memory-efficient. Based on the max-frequency window model, this paper proposes a BHS summary structure and a novel algorithm called GGFI-MFW. The algorithm updates the summaries only for subsets of the newly arrived data and can generate the max-frequency of a given itemset directly, without scanning the whole summary. Experimental results indicate that the proposed algorithm efficiently finds global frequent itemsets over a data stream using little memory, and that its advantage is most pronounced when the number of distinct items is large.
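
To make the max-frequency notion concrete, the following is a minimal brute-force sketch. It assumes the usual definition in the max-frequency window model: the max-frequency of an itemset is its highest relative frequency over all windows that end at the most recent transaction and contain at least some minimum number of transactions. This is not the BHS summary structure or the GGFI-MFW algorithm, whose details the abstract does not give; the names `max_frequency` and `min_window` are hypothetical, and the sketch rescans the whole stream, which is exactly the cost a summary structure is meant to avoid.

```python
from fractions import Fraction
from typing import Iterable, Sequence


def max_frequency(
    stream: Sequence[Iterable[str]],
    itemset: Iterable[str],
    min_window: int = 1,
) -> Fraction:
    """Naive max-frequency of `itemset` (assumed definition, see lead-in).

    Considers every window that ends at the latest transaction and has
    at least `min_window` transactions, and returns the highest relative
    frequency of `itemset` among those windows.
    """
    target = frozenset(itemset)
    best = Fraction(0)
    hits = 0
    # Grow the window backwards, one transaction at a time,
    # starting from the most recent transaction.
    for length, transaction in enumerate(reversed(stream), start=1):
        if target <= set(transaction):
            hits += 1
        if length >= min_window:
            best = max(best, Fraction(hits, length))
    return best


# Illustrative stream of five transactions, oldest first (made-up data).
stream = [{"a", "b"}, {"c"}, {"a"}, {"a", "b"}, {"a", "b"}]
print(max_frequency(stream, {"a", "b"}))                # window of length 1 or 2
print(max_frequency(stream, {"a", "b"}, min_window=3))  # window of length 3
```

A minimum window length keeps a single recent occurrence from trivially yielding a frequency of 1. The sketch runs in time linear in the stream length per query and stores the full stream, so it serves only as a reference point for what an incremental summary would need to reproduce.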