Efficient algorithms for mining maximal high utility itemsets from data streams with different models

Authors:
Bai-En Shie;Philip S. Yu;Vincent S. Tseng
Affiliations:
Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan, ROC;Department of Computer Science, University of Illinois at Chicago, Chicago, IL, USA;Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan, ROC
Venue:
Expert Systems with Applications: An International Journal
Year:
2012

Abstract

Data stream mining is an emerging research topic in the data mining field. Finding frequent itemsets is one of the most important tasks in data stream mining, with wide applications such as online e-business and web click-stream analysis. However, two main problems exist in the relevant studies: (1) The utilities (e.g., importance or profits) of items are not considered, so frequent itemsets cannot reflect the actual utilities of patterns. (2) Existing utility mining methods produce too many patterns, which makes it difficult for users to pick out the useful ones. In view of this, we propose a novel framework, named GUIDE (Generation of maximal high Utility Itemsets from Data strEams), to find maximal high utility itemsets from data streams under different models, i.e., the landmark, sliding window and time fading models. The proposed structure, named MUI-Tree (Maximal high Utility Itemset Tree), maintains the essential information for the mining process, and the proposed strategies further improve the performance of GUIDE. The main contributions of this paper are as follows: (1) To the best of our knowledge, this is the first work on mining a compact form of high utility patterns from data streams; (2) GUIDE is an effective one-pass framework that meets the requirements of data stream mining; (3) GUIDE generates novel patterns that are not only high utility but also maximal, providing compact and insightful hidden information in the data streams. Experimental results show that our approach outperforms the state-of-the-art algorithms under various conditions in data stream environments on different models.
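
To make the terms in the abstract concrete, the following is a minimal sketch of what a "maximal high utility itemset" is. It is not GUIDE and does not use the MUI-Tree; it is a brute-force enumeration that is only feasible on toy data. The transactions, the profit table, the function names and the threshold are all hypothetical. The sketch assumes the common definition that the utility of an itemset is the sum of quantity times unit profit over the transactions containing the whole itemset, and it treats an itemset as maximal when no strict superset is also high utility. The optional weights show one way a time fading model could discount older transactions; a sliding window is approximated by passing only the most recent transactions.

```python
from itertools import combinations

# Hypothetical toy stream: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical per-unit profit of each item.
profit = {"a": 3, "b": 1, "c": 5}


def utility(itemset, transactions, profit, weights=None):
    """Weighted sum of quantity * profit over transactions containing the whole itemset."""
    total = 0.0
    for idx, t in enumerate(transactions):
        if all(item in t for item in itemset):
            w = 1.0 if weights is None else weights[idx]
            total += w * sum(t[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, profit, min_util, weights=None):
    """Brute force: test every non-empty itemset against the utility threshold."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            u = utility(combo, transactions, profit, weights)
            if u >= min_util:
                result[frozenset(combo)] = u
    return result


def maximal(itemsets):
    """Keep only itemsets that have no strict superset in the same collection."""
    return {s: u for s, u in itemsets.items()
            if not any(s < other for other in itemsets)}


# Landmark-style view: every transaction seen so far counts equally.
landmark = maximal(high_utility_itemsets(transactions, profit, min_util=10))

# Sliding-window-style view: only the two most recent transactions.
window = maximal(high_utility_itemsets(transactions[-2:], profit, min_util=10))

# Time-fading-style view: older transactions are discounted (assumed decay 0.5).
decay = 0.5
n = len(transactions)
fading_weights = [decay ** (n - 1 - i) for i in range(n)]
fading = maximal(high_utility_itemsets(transactions, profit, 10, fading_weights))

print(landmark, window, fading)
```

In this toy data the itemsets {a}, {c} and {a, c} all reach the threshold in the landmark view, and only {a, c} is reported because it contains the other two. This is the compaction effect the abstract refers to: the user sees one pattern instead of three.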