Approximate mining of maximal frequent itemsets in data streams with different window models

Authors:
Hua-Fu Li;Suh-Yin Lee
Affiliations:
Department of Computer Science, Kainan University, No.1 Kainan Road, Luzhu Shiang, Taoyuan 338, Taiwan, ROC;Department of Computer Science, National Chiao-Tung University, 1001 Ta-Hsueh Road, Hsinchu 300, Taiwan, ROC
Venue:
Expert Systems with Applications: An International Journal
Year:
2008

Abstract

A data stream is a massive, open-ended sequence of data elements continuously generated at a rapid rate. Mining data streams is more difficult than mining static databases because of the huge, high-speed and continuous characteristics of streaming data. In this paper, we propose a new one-pass algorithm called DSM-MFI (Data Stream Mining for Maximal Frequent Itemsets), which mines the set of all maximal frequent itemsets in landmark windows over data streams. A new summary data structure called the summary frequent itemset forest (abbreviated as SFI-forest) is developed for incrementally maintaining the essential information about maximal frequent itemsets embedded in the stream so far. Theoretical analysis and experimental studies show that the proposed algorithm is efficient and scalable for mining the set of all maximal frequent itemsets over the entire history of the data streams.
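
To make the terms above concrete, the following is a minimal Python sketch of what "maximal frequent itemsets in a landmark window" means. It is a brute-force illustration only and is not the DSM-MFI algorithm or the SFI-forest: it stores every transaction and enumerates all candidate itemsets, which a one-pass stream algorithm is designed to avoid. The function name, the sample transactions, and the relative support threshold of 0.5 are assumptions made for illustration.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force illustration (hypothetical helper, not DSM-MFI).

    An itemset is frequent if it occurs in at least min_support * n of
    the n transactions; it is maximal if no proper superset is frequent.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions)) if transactions else []
    frequent = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support * n:
                frequent.append(candidate)
    # Keep only itemsets that have no frequent proper superset.
    return [s for s in frequent if not any(s < other for other in frequent)]


# Landmark window: every transaction from the landmark (here, the start
# of the stream) up to the current time is kept in scope.
window = []
for transaction in [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]:
    window.append(frozenset(transaction))

result = maximal_frequent_itemsets(window, min_support=0.5)
```

With these four sample transactions, the frequent itemsets are {a}, {b}, {c}, {a, b} and {a, c}, so the maximal ones are {a, b} and {a, c}. The cost of this enumeration grows exponentially with the number of distinct items, which is why a compact summary structure is needed for streams.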