A false negative maximal frequent itemset mining algorithm over stream

Authors:
Haifeng Li;Ning Zhang
Affiliations:
School of Information, Central University of Finance and Economics, Beijing, China;School of Information, Central University of Finance and Economics, Beijing, China
Venue:
ADMA'11 Proceedings of the 7th international conference on Advanced Data Mining and Applications - Volume Part I
Year:
2011


Abstract

Maximal frequent itemsets are one of several condensed representations of frequent itemsets; they retain most of the information contained in the frequent itemsets while using less space, which makes them better suited to stream mining. This paper focuses on approximately mining maximal frequent itemsets over a data stream under the landmark model. We divide the continuously arriving transactions into sections and maintain them in 3-tuple lists indexed by an extended direct update tree, and on this basis propose an efficient algorithm named FNMFIMoDS. The algorithm employs the Chernoff bound to mine maximal frequent itemsets in a false negative manner. In addition, it classifies the itemsets into categories and prunes redundant itemsets, which further reduces the memory cost and guarantees that the algorithm runs incrementally. Experimental results on two synthetic datasets and two real-world datasets show that, while maintaining high precision, FNMFIMoDS runs faster and uses much less memory than the state-of-the-art algorithm.
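
As a rough illustration of the two ideas the abstract combines, the sketch below shows a Chernoff-style running error bound used to prune unpromising itemsets, followed by a filter that keeps only maximal itemsets. It is not the FNMFIMoDS algorithm: the abstract does not state the bound, so the formula eps_n = sqrt(2 * s * ln(2 / delta) / n) is an assumption borrowed from the general false-negative approach in the stream-mining literature, and every function name, the plain dictionary of counts, and the sample data are hypothetical. The 3-tuple lists and the extended direct update tree are not modelled.

```python
import math


def chernoff_epsilon(min_support: float, delta: float, n: int) -> float:
    """Assumed running error bound: eps_n = sqrt(2 * s * ln(2 / delta) / n).

    s is the minimum support, delta the allowed failure probability,
    and n the number of transactions seen so far. The bound shrinks as n grows.
    """
    return math.sqrt(2.0 * min_support * math.log(2.0 / delta) / n)


def prune_candidates(counts, n, min_support, delta):
    """Keep itemsets whose observed support is at least s - eps_n.

    Itemsets below this relaxed threshold are discarded to save memory.
    """
    threshold = min_support - chernoff_epsilon(min_support, delta, n)
    return {iset: c for iset, c in counts.items() if c / n >= threshold}


def maximal_itemsets(itemsets):
    """Return the itemsets that have no proper superset in the collection."""
    sets = {frozenset(s) for s in itemsets}
    return {s for s in sets if not any(s < t for t in sets)}


def report_maximal(counts, n, min_support):
    """Report maximal itemsets among those whose observed support reaches s."""
    frequent = [iset for iset, c in counts.items() if c / n >= min_support]
    return maximal_itemsets(frequent)


if __name__ == "__main__":
    # Hypothetical counts after n = 1000 transactions.
    n = 1000
    counts = {
        frozenset("a"): 300,
        frozenset("ab"): 120,
        frozenset("c"): 80,
        frozenset("d"): 10,
    }
    kept = prune_candidates(counts, n, min_support=0.1, delta=0.1)
    print(sorted("".join(sorted(s)) for s in kept))
    print(sorted("".join(sorted(s)) for s in report_maximal(kept, n, 0.1)))
```

In this sketch an itemset is reported only when its observed support reaches the minimum support, so errors take the form of missed itemsets rather than spurious ones, which is the sense of "false negative" used here. The maximal filter is a quadratic scan chosen for clarity, not efficiency.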