Mining long high utility itemsets in transaction databases

Authors:
Guangzhu Yu;Shihuang Shao;Xianhui Zeng
Affiliations:
Information and Technology College, DongHua University, Shanghai, China;Information and Technology College, DongHua University, Shanghai, China;Information and Technology College, DongHua University, Shanghai, China
Venue:
WSEAS Transactions on Information Science and Applications
Year:
2008

Citing 18
Cited 3

Efficiently mining long patterns from databases

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Turbo-charging vertical mining of large databases

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
A tree projection algorithm for generation of frequent item sets

Journal of Parallel and Distributed Computing - Special issue on high-performance data mining
Discovering associations with numeric variables

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Extracting Share Frequent Itemsets with Infrequent Subsets

Data Mining and Knowledge Discovery
A Statistical Theory for Quantitative Association Rules

Journal of Intelligent Information Systems
Profit Mining: From Patterns to Actions

EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
An Efficient Algorithm for Mining Association Rules in Large Databases

VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Value Added Association Rules

PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Objective-Oriented Utility-Based Association Mining

ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Mining Association Rules with Weighted Items

IDEAS '98 Proceedings of the 1998 International Symposium on Database Engineering & Applications
Mining High Utility Itemsets

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach

Data Mining and Knowledge Discovery
Fast vertical mining using diffsets

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
COBBLER: Combining Column and Row Enumeration for Closed Pattern Discovery

SSDBM '04 Proceedings of the 16th International Conference on Scientific and Statistical Database Management
A fast high utility itemsets mining algorithm

UBDM '05 Proceedings of the 1st international workshop on Utility-based data mining
Mining itemset utilities from transaction databases

Data & Knowledge Engineering - Special issue: ER 2003
Mining weighted association rules

Intelligent Data Analysis

Comparative genome sequence analysis by efficient pattern matching technique

WSEAS Transactions on Information Science and Applications
Parallel Method for Mining High Utility Itemsets from Vertically Partitioned Distributed Databases

KES '09 Proceedings of the 13th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems: Part I
An efficient strategy for mining high utility itemsets

International Journal of Intelligent Information and Database Systems

Abstract

Existing algorithms for utility mining are based on column enumeration and adopt an Apriori-like candidate generation-and-test approach, so they perform poorly on datasets with high dimensionality or long patterns. To solve this problem, this paper proposes a hybrid model and a row-enumeration-based algorithm, called inter-transaction, that together discover high utility itemsets from two directions: existing algorithms such as UMining [1] seek short high utility itemsets from the bottom, while inter-transaction seeks long high utility itemsets from the top. By intersecting relevant transactions, the new algorithm identifies long high utility itemsets directly, without extending short itemsets step by step. In addition, new pruning strategies reduce the search space, and an optimization technique speeds up the intersection of transactions. Experiments on synthetic data show that the method achieves high performance, especially on large, high-dimensional datasets.
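
The idea of finding long itemsets by intersecting transactions can be illustrated with a minimal Python sketch. It is not the inter-transaction algorithm itself: the abstract does not specify the pruning strategies or the intersection optimization, so none are included, and only pairwise intersections are tried. The toy data, the `unit_profit` table, the function names, and the assumption that an itemset's utility is the sum of quantity times unit profit over the transactions containing it are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1, "d": 1},
    {"a": 2, "b": 1, "c": 3, "d": 2},
    {"a": 1, "c": 1, "e": 4},
    {"b": 1, "e": 2},
]
# Hypothetical per-unit profit of each item.
unit_profit = {"a": 5, "b": 3, "c": 4, "d": 10, "e": 1}


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def long_candidates_by_intersection(transactions, min_len):
    """Intersect each pair of transactions; keep shared itemsets of at least min_len items."""
    found = set()
    for t1, t2 in combinations(transactions, 2):
        common = frozenset(t1.keys() & t2.keys())
        if len(common) >= min_len:
            found.add(common)
    return found


def high_utility_long_itemsets(transactions, unit_profit, min_len, min_utility):
    """Return {itemset: utility} for intersection candidates that reach min_utility."""
    result = {}
    for itemset in long_candidates_by_intersection(transactions, min_len):
        u = utility(itemset, transactions, unit_profit)
        if u >= min_utility:
            result[itemset] = u
    return result


if __name__ == "__main__":
    for itemset, u in high_utility_long_itemsets(transactions, unit_profit, 3, 60).items():
        print(sorted(itemset), u)
```

In this toy database the four-item set {a, b, c, d} is obtained from a single intersection of the first two transactions, whereas a bottom-up, Apriori-like search would have to pass through its 1-, 2-, and 3-item subsets first.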