Efficient itemset generator discovery over a stream sliding window

Authors:
Chuancong Gao;Jianyong Wang
Affiliations:
Tsinghua University, Beijing, China;Tsinghua University, Beijing, China
Venue:
Proceedings of the 18th ACM conference on Information and knowledge management
Year:
2009

Citing 21
Cited 2

Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Efficient mining of emerging patterns: discovering trends and differences

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Making use of the most expressive jumping emerging patterns for classification

Knowledge and Information Systems
FOIL: A Midterm Report

ECML '93 Proceedings of the European Conference on Machine Learning
CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules

ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Text Document Categorization by Term Association

ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach

Data Mining and Knowledge Discovery
CLOSET+: searching for the best strategies for mining frequent closed itemsets

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
On computing, storing and querying frequent patterns

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining top-K covering rule groups for gene expression data

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
CFI-Stream: mining closed frequent itemsets in data streams

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Catch the moment: maintaining closed frequent itemsets over a data stream sliding window

Knowledge and Information Systems
On Mining Instance-Centric Classification Rules

IEEE Transactions on Knowledge and Data Engineering
Lazy Associative Classification

ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Mining statistically important equivalence classes and delta-discriminative emerging patterns

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
False positive or false negative: mining frequent itemsets from high speed transactional data streams

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Efficient mining of frequent sequence generators

Proceedings of the 17th international conference on World Wide Web
Direct Discriminative Pattern Mining for Effective Classification

ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Minimum description length principle: generators are preferable to closed patterns

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
An incremental algorithm for mining generators representation

PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases

Direct mining of discriminative patterns for classifying uncertain data

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Key roles of closed sets and minimal generators in concise representations of frequent patterns

Intelligent Data Analysis

Quantified Score

Hi-index	0.00

Visualization

Abstract

Mining generator patterns has raised great research interest in recent years. The main purpose of mining itemset generators is that they can form equivalence classes together with closed itemsets, and can be used to generate simple classification rules according to the MDL principle. In this paper, we devise an efficient algorithm called StreamGen to mine frequent itemset generators over a stream sliding window. We adopt a novel enumeration tree structure to help keep the information of mined generators and the border between generators and non-generators, and propose some optimization techniques to speed up the mining process. We further extend the algorithm to directly mine a set of high quality classification rules over stream sliding windows while keeping high performance. The extensive performance study shows that our algorithm outperforms other state-of-the-art algorithms which perform similar tasks in terms of both runtime and memory usage efficiency, and has high utility in terms of classification.