Mining top-K high utility itemsets

Authors:
Cheng Wei Wu; Bai-En Shie; Vincent S. Tseng; Philip S. Yu
Affiliations:
Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, ROC; Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, ROC; Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, ROC; Department of Computer Science, University of Illinois at Chicago, Chicago, Illinois, USA
Venue:
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2012

Abstract

Mining high utility itemsets from databases is an emerging topic in data mining. It refers to the discovery of itemsets with utilities higher than a user-specified minimum utility threshold min_util. Although several studies have addressed this topic, choosing an appropriate minimum utility threshold remains difficult for users. If min_util is set too low, too many high utility itemsets are generated, which may make the mining algorithms inefficient or even cause them to run out of memory. If min_util is set too high, no high utility itemset is found. Setting the threshold by trial and error is tedious for users. In this paper, we address this problem by proposing a new framework named top-k high utility itemset mining, where k is the desired number of high utility itemsets to be mined. We propose an efficient algorithm named TKU (Top-K Utility itemsets mining) that mines such itemsets without requiring min_util to be set. TKU includes several features designed to handle the new challenges raised by this problem, such as the absence of the anti-monotone property and the requirement of lossless results. Moreover, TKU incorporates several novel strategies for pruning the search space to achieve high efficiency. Results on real and synthetic datasets show that TKU has excellent performance and scalability.
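
To make the problem statement concrete, the following is a minimal brute-force sketch of top-k high utility itemset mining. It is not TKU and uses none of its pruning strategies: it simply enumerates every itemset, which is only feasible for tiny inputs. The data layout (each transaction as a dict from item to purchased quantity, plus a dict of unit profits), the function names, and the toy database are all assumptions made for illustration.

```python
import heapq
from itertools import combinations


def itemset_utility(itemset, transactions, profit):
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def top_k_high_utility_itemsets(transactions, profit, k):
    """Naive exhaustive search; returns up to k (utility, itemset) pairs, best first."""
    items = sorted({item for t in transactions for item in t})
    heap = []  # min-heap holding the k best itemsets seen so far
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = itemset_utility(itemset, transactions, profit)
            if u <= 0:
                continue  # itemset occurs in no transaction
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:
                # heap[0][0] plays the role of a threshold that rises as
                # better itemsets are found; no min_util is supplied up front.
                heapq.heapreplace(heap, (u, itemset))
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical toy database: item -> quantity per transaction.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
profit = {"a": 5, "b": 2, "c": 1}  # hypothetical unit profits

print(top_k_high_utility_itemsets(transactions, profit, 2))
# [(15, ('a',)), (11, ('a', 'c'))]
```

In this toy database the itemset {a, b} has utility 9, which exceeds the utility 6 of its subset {b}, while {a, c} has utility 11, which is below the utility 15 of its subset {a}. Utility is therefore neither monotone nor anti-monotone, which is why support-style pruning does not carry over directly. Ties at the k-th position are broken arbitrarily in this sketch.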