Max-FISM: Mining (recently) maximal frequent itemsets over data streams using the sliding window model

Authors:
Zahra Farzanyar; Mohammadreza Kangavari; Nick Cercone
Affiliations:
Department of Computer Engineering, Iran University of Science & Technology, Tehran, Iran; Department of Computer Engineering, Iran University of Science & Technology, Tehran, Iran; Department of Computer Science and Engineering, York University, Toronto, Canada
Venue:
Computers & Mathematics with Applications
Year:
2012

Abstract

Frequent itemset mining from data streams is an important data mining problem with broad applications such as retail market data analysis, network monitoring, web usage mining, and stock market prediction. However, it is also a difficult problem because streaming data is unbounded, high-speed, and continuous. Extracting frequent itemsets from more recent data can therefore enhance the analysis of stream data. In this paper, we propose an efficient algorithm, called Max-FISM (Maximal-Frequent Itemsets Mining), for mining recent maximal frequent itemsets from a high-speed stream of transactions within a sliding window. In our algorithm, whenever a new transaction is inserted into the current window, only its maximum itemset is inserted into a prefix-tree-based summary data structure called Max-Set, which maintains the number of independent appearances of each transaction in the current window. The set of recent maximal frequent itemsets is then obtained from the current Max-Set. Experimental studies show that the proposed Max-FISM algorithm is highly efficient in terms of memory and time complexity for mining recent maximal frequent itemsets over high-speed data streams.
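
To make the problem statement concrete, the following is a minimal brute-force sketch of what "recent maximal frequent itemsets within a sliding window" means. It is not the Max-FISM algorithm and does not use the Max-Set prefix tree described above; it simply enumerates every subset of every transaction in the window, which is only feasible for tiny examples. The names `SlidingWindowMiner`, `maximal_frequent_itemsets`, `window_size`, and `min_count` are hypothetical, and treating the support threshold as an absolute count is an assumption made for illustration.

```python
from collections import Counter, deque
from itertools import combinations


def maximal_frequent_itemsets(window, min_count):
    """Brute-force baseline: count every non-empty subset of every transaction."""
    counts = Counter()
    for transaction in window:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[frozenset(subset)] += 1
    # Frequent itemsets appear in at least min_count transactions of the window.
    frequent = [s for s, c in counts.items() if c >= min_count]
    # An itemset is maximal if it has no frequent proper superset.
    return {s for s in frequent if not any(s < other for other in frequent)}


class SlidingWindowMiner:
    """Keeps only the most recent window_size transactions (hypothetical helper)."""

    def __init__(self, window_size, min_count):
        # deque with maxlen drops the oldest transaction automatically.
        self.window = deque(maxlen=window_size)
        self.min_count = min_count

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def query(self):
        return maximal_frequent_itemsets(self.window, self.min_count)


miner = SlidingWindowMiner(window_size=4, min_count=2)
for t in [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}, {"a", "b", "d"}]:
    miner.add(t)
result = miner.query()
```

In this run the first transaction has already slid out of the window, so `result` contains the two itemsets {a, b} and {b, d}: each appears in two of the four retained transactions, and neither has a frequent proper superset. The exponential subset enumeration in this baseline is the kind of cost that a compact summary structure such as Max-Set is meant to avoid.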