Mining maximal frequent itemsets from data streams

Authors:
Guojun Mao; Xindong Wu; Xingquan Zhu; Gong Chen; Chunnian Liu
Affiliations:
Department of Computer Science, University of Vermont, Burlington VT 05405, USA; Department of Computer Science, University of Vermont, Burlington VT 05405, USA; Department of Computer Science, University of Vermont, Burlington VT 05405, USA; Department of Computer Science, University of Vermont, Burlington VT 05405, USA; School of Computer Science, Beijing University of Technology, Beijing 100022, P.R. China
Venue:
Journal of Information Science
Year:
2007

Abstract

Frequent pattern mining from data streams is an active research topic in data mining. Existing approaches often rely on a two-phase framework to discover frequent patterns: (1) using internal data structures to store meta-patterns obtained by scanning the stream data; and (2) re-mining the meta-patterns to finalize and output frequent patterns. The weakness of this two-phase framework is that the separation between the two stages prevents frequent patterns from being found dynamically and immediately in an online setting. A single-phase algorithm, by contrast, should let users see patterns as soon as they become frequent. In this paper, we propose INSTANT, a single-phase algorithm for discovering frequent itemsets from data streams. INSTANT is founded on a framework theory on a set of itemsets, which is also presented in the paper. Its design uses compact data structures to mine frequent patterns from data streams in a single phase. Our experimental results demonstrate the time and space efficiency of the proposed algorithm.
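
To make the goal concrete, here is a minimal brute-force sketch of what it means to report maximal frequent itemsets after each arriving transaction. It is not the INSTANT algorithm and does not use its data structures, which the abstract does not describe. The function name `maximal_frequent_itemsets`, the `min_count` threshold, and the toy stream are assumptions made for illustration only.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_count):
    """Yield (position, maximal frequent itemsets) after each transaction.

    Illustrative brute force only (hypothetical helper, not INSTANT):
    it counts every subset of every transaction, so it is exponential
    in transaction length.
    """
    counts = {}  # frozenset -> number of transactions seen that contain it
    for position, transaction in enumerate(transactions, start=1):
        items = sorted(set(transaction))
        # Count every non-empty subset of the newly arrived transaction.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                key = frozenset(subset)
                counts[key] = counts.get(key, 0) + 1
        # An itemset is frequent once its count reaches the threshold.
        frequent = [s for s, c in counts.items() if c >= min_count]
        # It is maximal if no frequent itemset is a proper superset of it.
        maximal = [s for s in frequent if not any(s < t for t in frequent)]
        yield position, sorted(sorted(s) for s in maximal)


# Toy stream (assumed data): three transactions arriving one by one.
stream = [["a", "b"], ["a", "b", "c"], ["b", "c"]]
results = list(maximal_frequent_itemsets(stream, min_count=2))
for position, maximal in results:
    print(position, maximal)
```

With the toy stream, the itemset {a, b} is reported as soon as the second transaction arrives, and {b, c} joins it after the third. This immediate reporting is the behaviour the abstract asks of a single-phase method. The sketch obtains it at exponential cost per transaction, which is exactly what a compact data structure is meant to avoid.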