A fast high utility itemsets mining algorithm

Authors:
Ying Liu;Wei-keng Liao;Alok Choudhary
Affiliations:
Northwestern University, Evanston, IL;Northwestern University, Evanston, IL;Northwestern University, Evanston, IL
Venue:
UBDM '05 Proceedings of the 1st international workshop on Utility-based data mining
Year:
2005


Abstract

Association rule mining (ARM) identifies frequent itemsets in databases and generates association rules, treating every item as having equal value. In many real applications, however, such as retail marketing and network logs, items differ in many respects, and these differences strongly affect decision making. Traditional ARM therefore cannot meet the demands of these applications. By treating the different values of individual items as utilities, utility mining focuses on identifying the itemsets with high utilities. Because the "downward closure property" does not apply to utility mining, the generation of candidate itemsets is the most costly step in terms of time and memory space. In this paper, we present a Two-Phase algorithm that efficiently prunes the number of candidates and precisely obtains the complete set of high utility itemsets. In the first phase, we propose a model that applies the "transaction-weighted downward closure property" to the search space to expedite the identification of candidates. In the second phase, one extra database scan is performed to identify the high utility itemsets. We also parallelize our algorithm on a shared-memory multi-processor architecture using the Common Count Partitioned Database (CCPD) strategy. We verify our algorithm by applying it to both synthetic and real databases. It performs very efficiently in terms of speed and memory cost, and shows good scalability on multiple processors, even on large databases that are difficult for existing algorithms to handle.
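
The two phases described in the abstract can be illustrated with a small, sequential Python sketch. It is not the authors' implementation. The abstract does not spell out the definitions, so the sketch assumes that an item's utility in a transaction is its quantity times a per-item profit, that a transaction's utility is the sum over its items, and that the transaction-weighted utilization (TWU) of an itemset is the summed utility of the transactions containing it. All function names, the toy database, and the threshold are hypothetical, and the CCPD parallelization is not shown.

```python
from itertools import combinations


def transaction_utility(tx, profit):
    """Utility of a whole transaction: sum of quantity * unit profit (assumed definition)."""
    return sum(qty * profit[item] for item, qty in tx.items())


def itemset_utility(itemset, db, profit):
    """Exact utility of an itemset, summed over the transactions that contain it."""
    return sum(
        sum(tx[i] * profit[i] for i in itemset)
        for tx in db
        if all(i in tx for i in itemset)
    )


def twu(itemset, db, tx_utils):
    """Transaction-weighted utilization: total utility of transactions containing the itemset."""
    return sum(tu for tx, tu in zip(db, tx_utils) if all(i in tx for i in itemset))


def two_phase(db, profit, min_util):
    tx_utils = [transaction_utility(tx, profit) for tx in db]

    # Phase I: level-wise search. TWU never increases when an itemset grows,
    # so any itemset with TWU below the threshold can be pruned with its supersets.
    items = sorted({i for tx in db for i in tx})
    level = [frozenset([i]) for i in items if twu([i], db, tx_utils) >= min_util]
    candidates = list(level)
    k = 1
    while level:
        k += 1
        level_set = set(level)
        joined = {a | b for a in level for b in level if len(a | b) == k}
        level = [
            c for c in joined
            if all(frozenset(s) in level_set for s in combinations(c, k - 1))
            and twu(c, db, tx_utils) >= min_util
        ]
        candidates.extend(level)

    # Phase II: one extra pass computes exact utilities and filters the candidates.
    result = {}
    for c in candidates:
        u = itemset_utility(c, db, profit)
        if u >= min_util:
            result[c] = u
    return result


# Toy data (hypothetical): each transaction maps item -> purchased quantity.
profit = {"a": 5, "b": 1, "c": 2}
db = [
    {"a": 1, "b": 2},
    {"b": 4, "c": 1},
    {"a": 2, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
result = two_phase(db, profit, min_util=15)
for itemset, utility in sorted(result.items(), key=lambda kv: kv[1]):
    print(sorted(itemset), utility)
```

On this toy data the itemset {a, c} is high utility (23) even though its subset {c} is not (10), which shows why the ordinary downward closure property cannot be used for pruning, while the TWU bound still discards {b, c} and {a, b, c} before the second scan.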