Efficient Mining of a Concise and Lossless Representation of High Utility Itemsets

Authors:
Cheng Wei Wu;Philippe Fournier-Viger;Philip S. Yu;Vincent S. Tseng
Venue:
ICDM '11 Proceedings of the 2011 IEEE 11th International Conference on Data Mining
Year:
2011

Abstract

Mining high utility itemsets from transactional databases is an important data mining task, which refers to the discovery of itemsets with high utilities (e.g., high profits). Although several studies have been carried out, current methods may present too many high utility itemsets for users, which degrades the performance of the mining task in terms of execution and memory efficiency. To achieve high efficiency for the mining task and provide a concise mining result to users, we propose a novel framework in this paper for mining closed+ high utility itemsets, which serves as a compact and lossless representation of high utility itemsets. We present an efficient algorithm called CHUD (Closed+ High Utility itemset Discovery) for mining closed+ high utility itemsets. Further, a method called DAHU (Derive All High Utility itemsets) is proposed to recover all high utility itemsets from the set of closed+ high utility itemsets without accessing the original database. Results of experiments on real and synthetic datasets show that CHUD and DAHU are very efficient with a massive reduction (up to 800 times in our experiments) in the number of high utility itemsets. In addition, when all high utility itemsets are recovered by DAHU, the approach combining CHUD and DAHU also outperforms the state-of-the-art algorithms in mining high utility itemsets.
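
To make the terms in the abstract concrete, the following is a minimal brute-force sketch in Python, not the CHUD or DAHU algorithms themselves. It assumes the usual definition of utility (purchase quantity times unit profit, summed over the transactions that contain the itemset) and the usual support-based notion of a closed itemset (no proper superset with the same support). The toy database, the profit table, and all function names are hypothetical and chosen for illustration only. The extra annotation that turns closed itemsets into "closed+" ones is not modeled here.

```python
from itertools import combinations

# Hypothetical toy data: unit profit per item, and per-transaction purchase quantities.
PROFIT = {"a": 5, "b": 2, "c": 1}
DB = [
    {"a": 1, "b": 2},
    {"a": 2, "b": 1, "c": 3},
    {"b": 4, "c": 1},
]


def utility(itemset, db=DB, profit=PROFIT):
    """Sum quantity * unit profit over every transaction that contains the itemset."""
    total = 0
    for transaction in db:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * profit[item] for item in itemset)
    return total


def support(itemset, db=DB):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(item in t for item in itemset))


def high_utility_itemsets(min_util, db=DB, profit=PROFIT):
    """Enumerate all itemsets (exponential; fine only for toy data) with utility >= min_util."""
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            u = utility(combo, db, profit)
            if u >= min_util:
                result[frozenset(combo)] = u
    return result


def is_closed(itemset, db=DB, profit=PROFIT):
    """Closed (assumed definition): adding any further item strictly lowers the support."""
    base = support(itemset, db)
    return all(
        support(itemset | {item}, db) < base
        for item in profit
        if item not in itemset
    )


if __name__ == "__main__":
    huis = high_utility_itemsets(min_util=14)
    closed = {x: u for x, u in huis.items() if is_closed(x)}
    for name, group in (("high utility", huis), ("closed high utility", closed)):
        print(name, sorted((sorted(x), u) for x, u in group.items()))
```

On this toy database with a minimum utility of 14, five itemsets are high utility and four of them are closed, so the closed subset is already slightly smaller. The sketch only illustrates why a closed representation can be more concise; it says nothing about how CHUD prunes the search space or how DAHU recovers the full set.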