High utility pattern mining using the maximal itemset property and lexicographic tree structures

Authors:
Ming-Yen Lin; Tzer-Fu Tu; Sue-Chen Hsueh
Affiliations:
Dept. of Information Engineering and Computer Science, Feng Chia University, 100, Wenhua Road, Xitun, Taichung 407, Taiwan, ROC
Dept. of Information Engineering and Computer Science, Feng Chia University, 100, Wenhua Road, Xitun, Taichung 407, Taiwan, ROC
Dept. of Information Management, Chaoyang University of Technology, 168, Gifeng E. Road, Wufeng, Taichung 413, Taiwan, ROC
Venue:
Information Sciences: an International Journal
Year:
2012

Abstract

High utility mining aims to discover all of the high utility itemsets in a transactional database. Most algorithms proceed in two steps: the first identifies all of the potential itemsets, and the second determines which of those potential itemsets are actually high utility itemsets. The large number of potential itemsets generated in the first step is generally the mining bottleneck, so reducing that number can improve mining performance significantly. In this paper, we exploit a maximal itemset property and propose an algorithm called UMMI (high Utility Mining using the Maximal Itemset property) that significantly reduces the number of potential itemsets in the first step. In the second step, UMMI uses an effective lexicographic tree structure to determine all of the high utility itemsets. In our experiments on synthetic datasets, UMMI generally outperforms three previous algorithms: CTU-PRO, an optimized TWU-mining algorithm, and Two-Phase. On average, UMMI is 5, 3, and 7 times faster than CTU-PRO, TWU-mining, and Two-Phase, respectively. In an experiment on real data, UMMI is 6 times faster than Two-Phase, while the other two algorithms cannot complete the mining in a reasonable amount of time. UMMI uses an approximately fixed amount of memory, which is generally less than that used by the other algorithms in each mining run. The experimental results show that the proposed algorithm mines high utility itemsets efficiently and scales linearly with the number of transactions.
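
The two-step scheme described in the abstract can be illustrated with a minimal sketch. The code below is not UMMI: it is a generic, Two-Phase-style illustration that assumes the usual transaction-weighted utility (TWU) upper bound for the first step and a plain database scan for the second. The toy database, the profit table, and every function name are hypothetical. The last helper only shows that, because the TWU bound is downward closed, the set of potential itemsets is fully described by its maximal members; how UMMI actually exploits the maximal itemset property is not specified in the abstract.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical profit per unit of each item.
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1}


def transaction_utility(tx):
    """Total utility of one transaction."""
    return sum(qty * UNIT_PROFIT[item] for item, qty in tx.items())


def twu(itemset):
    """Transaction-weighted utility: sum of the utilities of all
    transactions containing the itemset (an upper bound on its utility)."""
    return sum(transaction_utility(tx) for tx in TRANSACTIONS
               if all(item in tx for item in itemset))


def utility(itemset):
    """Exact utility of the itemset over the whole database."""
    return sum(sum(tx[item] * UNIT_PROFIT[item] for item in itemset)
               for tx in TRANSACTIONS
               if all(item in tx for item in itemset))


def potential_itemsets(min_util):
    """Step 1: level-wise search for itemsets whose TWU reaches min_util."""
    items = sorted({item for tx in TRANSACTIONS for item in tx})
    level = [frozenset([i]) for i in items if twu([i]) >= min_util]
    found = []
    while level:
        found.extend(level)
        size = len(level[0]) + 1
        known = set(level)
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Keep a candidate only if all of its (size - 1)-subsets survived
        # the previous level and its own TWU reaches the threshold.
        level = [c for c in joined
                 if all(frozenset(s) in known for s in combinations(c, size - 1))
                 and twu(c) >= min_util]
    return found


def high_utility_itemsets(candidates, min_util):
    """Step 2: keep the candidates whose exact utility reaches min_util."""
    return {c: utility(c) for c in candidates if utility(c) >= min_util}


def maximal_itemsets(candidates):
    """Candidates that are not a proper subset of another candidate."""
    return [c for c in candidates if not any(c < other for other in candidates)]


candidates = potential_itemsets(min_util=15)
result = high_utility_itemsets(candidates, min_util=15)
print(len(candidates))                               # 5
print(sorted("".join(sorted(s)) for s in result))    # ['a', 'ab', 'ac']
print(sorted("".join(sorted(s)) for s in maximal_itemsets(candidates)))  # ['ab', 'ac']
```

In this toy run, the first step yields five potential itemsets but only three of them are high utility itemsets, which is the gap the abstract identifies as the bottleneck. The two maximal itemsets, {a, b} and {a, c}, cover all five potential itemsets as subsets.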