Fast mining erasable itemsets using NC_sets

Authors:
Zhi-Hong Deng;Xiao-Ran Xu
Affiliations:
Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China;Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China
Venue:
Expert Systems with Applications: An International Journal
Year:
2012

Abstract

Mining erasable itemsets, first introduced in 2009, is an emerging data mining task. In this paper, we present a new data representation called the NC_set, which keeps track of the complete information needed for mining erasable itemsets. Based on the NC_set, we propose a new algorithm called MERIT for mining erasable itemsets efficiently. The efficiency of MERIT rests on three techniques. First, the NC_set is a compact structure that prunes irrelevant data automatically. Second, computing the gain of an itemset is transformed into combining NC_sets, which can be completed in linear time by an ingenious strategy. Third, in some cases MERIT can find erasable itemsets directly, without generating candidate itemsets. To evaluate MERIT, we conducted extensive experiments on many synthetic product databases. Our performance study shows that MERIT is efficient and is on average about two orders of magnitude faster than META, the first algorithm for mining erasable itemsets.
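The abstract does not define the gain of an itemset or what makes an itemset erasable. As a rough illustration only, the sketch below assumes the definition commonly used in the erasable itemset literature: each product has a set of component items and a profit, the gain of an itemset is the total profit of all products that use at least one of its items, and an itemset is erasable when its gain does not exceed a threshold fraction of the total profit. The toy database, the function names, and the brute-force level-wise search are all hypothetical; this is neither MERIT nor the NC_set structure, whose details are not given here.

```python
from itertools import combinations

# Hypothetical toy product database: (set of component items, profit).
PRODUCTS = [
    ({"a", "b"}, 100),
    ({"b", "c"}, 200),
    ({"c", "d"}, 300),
    ({"d"}, 400),
]


def gain(itemset, products):
    """Total profit of the products that use at least one item of `itemset`
    (assumed definition, see the note above)."""
    wanted = set(itemset)
    return sum(profit for items, profit in products if items & wanted)


def erasable_itemsets(products, threshold):
    """Brute-force level-wise search for erasable itemsets.

    Returns a dict mapping each erasable itemset (a sorted tuple of items)
    to its gain. This is a naive baseline for illustration, not MERIT.
    """
    limit = threshold * sum(profit for _, profit in products)
    items = sorted(set().union(*(item_set for item_set, _ in products)))

    result = {}
    # Level 1: single items whose gain stays within the limit.
    level = [(i,) for i in items if gain((i,), products) <= limit]
    while level:
        for itemset in level:
            result[itemset] = gain(itemset, products)
        current = set(level)
        next_level = []
        for itemset in level:
            for i in items:
                if i <= itemset[-1]:
                    continue  # extend in sorted order to avoid duplicates
                candidate = itemset + (i,)
                # Adding items never lowers the gain, so every subset of an
                # erasable itemset must itself be erasable.
                subsets_ok = all(
                    sub in current
                    for sub in combinations(candidate, len(itemset))
                )
                if subsets_ok and gain(candidate, products) <= limit:
                    next_level.append(candidate)
        level = next_level
    return result


if __name__ == "__main__":
    for itemset, g in erasable_itemsets(PRODUCTS, 0.35).items():
        print(itemset, g)
```

With the toy data and a threshold of 0.35 (a limit of 350 out of a total profit of 1000), the erasable itemsets are ('a',) with gain 100, ('b',) with gain 300, and ('a', 'b') with gain 300. Every gain computation in this baseline rescans the whole database, which is the kind of repeated work a compact precomputed representation would aim to avoid.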