Finding Frequent Closed Itemsets in Sliding Window in Linear Time

Authors:
Junbo Chen;Bo Zhou;Lu Chen;Xinyu Wang;Yiqun Ding
Venue:
IEICE - Transactions on Information and Systems
Year:
2008

Citing 9

Computing iceberg concept lattices with TITANIC
Data & Knowledge Engineering

Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Adaptive and Resource-Aware Mining of Frequent Sets
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining

CLOSET+: searching for the best strategies for mining frequent closed itemsets
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining

Fast and Memory Efficient Mining of Frequent Closed Itemsets
IEEE Transactions on Knowledge and Data Engineering

Frequent closed itemset based algorithms: a thorough structural and analytical survey
ACM SIGKDD Explorations Newsletter

CFI-Stream: mining closed frequent itemsets in data streams
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining

GC-tree: a fast online algorithm for mining frequent closed itemsets
PAKDD'07 Proceedings of the 2007 international conference on Emerging technologies in knowledge discovery and data mining

TGC-tree: an online algorithm tracing closed itemset and transaction set simultaneously
LKR'08 Proceedings of the 3rd international conference on Large-scale knowledge resources: construction and application

Cited 1

Interactive mining of high utility patterns over data streams
Expert Systems with Applications: An International Journal

Abstract

One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many other algorithms have been proposed to find frequent itemsets. Among them, the approach of mining closed itemsets has attracted much interest in the data mining community. Algorithms taking this approach include TITANIC [8], CLOSET+ [6], DCI-Closed [4], CFI-Stream [3], GC-Tree [15], and TGC-Tree [16]. Of these, CFI-Stream, GC-Tree, and TGC-Tree are online algorithms that work in sliding-window environments. According to the performance evaluation in [16], GC-Tree [15] is the fastest of the three. In this paper, an improved algorithm based on GC-Tree is proposed, whose computational complexity is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state-of-the-art algorithm, GC-Tree.
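To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of what "frequent closed itemsets in a sliding window" means. It is not the GC-Tree algorithm or the improved algorithm proposed in the paper, and it runs in exponential rather than linear time. The function names, the `window_size` and `min_support` parameters, and the use of an absolute support count are all assumptions made for illustration. An itemset is treated as frequent if at least `min_support` transactions in the current window contain it, and as closed if no proper superset has the same support.

```python
from collections import deque
from itertools import combinations


def closed_frequent_itemsets(window, min_support):
    """Brute-force illustration (hypothetical helper, not the paper's method).

    Returns {itemset: support} for every frequent closed itemset among the
    transactions currently in `window`.
    """
    transactions = [frozenset(t) for t in window]
    items = sorted(set().union(*transactions))

    # Count the support of every non-empty candidate itemset.
    support = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                support[itemset] = count

    # Keep an itemset only if no proper superset has the same support.
    # A superset with equal support is itself frequent, so it is enough
    # to compare against the frequent itemsets collected above.
    return {
        itemset: count
        for itemset, count in support.items()
        if not any(
            itemset < other and count == other_count
            for other, other_count in support.items()
        )
    }


def mine_stream(stream, window_size, min_support):
    """Slide a fixed-size window over `stream`, re-mining after each arrival."""
    window = deque(maxlen=window_size)  # the oldest transaction drops out automatically
    for transaction in stream:
        window.append(transaction)
        yield closed_frequent_itemsets(window, min_support)


if __name__ == "__main__":
    stream = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b", "c"}]
    for step, result in enumerate(mine_stream(stream, window_size=3, min_support=2), 1):
        readable = {"".join(sorted(k)): v for k, v in result.items()}
        print(step, readable)
```

This sketch recomputes everything from scratch each time the window moves. Online algorithms such as those named in the abstract instead update their result incrementally as a transaction enters or leaves the window.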