USpan: an efficient algorithm for mining high utility sequential patterns

Authors:
Junfu Yin; Zhigang Zheng; Longbing Cao
Affiliations:
University of Technology, Sydney, Sydney, Australia; University of Technology, Sydney, Sydney, Australia; University of Technology, Sydney, Sydney, Australia
Venue:
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2012

Abstract

Sequential pattern mining plays an important role in many applications, such as bioinformatics and consumer behavior analysis. However, the classic frequency-based framework often yields many patterns, most of which are not informative enough for business decision-making. In frequent pattern mining, a recent effort has been to incorporate utility into the pattern selection framework, so that high utility patterns, whether frequent or infrequent, are mined to address typical business concerns such as the dollar value associated with each pattern. In this paper, we incorporate utility into sequential pattern mining and define a generic framework for high utility sequence mining. We present an efficient algorithm, USpan, for mining high utility sequential patterns. In USpan, we introduce the lexicographic quantitative sequence tree to extract the complete set of high utility sequences, and we design concatenation mechanisms for calculating the utility of a node and its children, together with two effective pruning strategies. Extensive experiments on both synthetic and real datasets show that USpan efficiently identifies high utility sequences from large-scale data, even at very low minimum utility thresholds.
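To make the notion of a high utility sequence concrete, the following Python sketch computes the utility of one candidate pattern over a small quantitative sequence database by brute force. It is not the USpan algorithm: it builds no lexicographic quantitative sequence tree and applies no pruning. The data layout, the function names, the sample prices, and the convention of taking the maximum utility over all occurrences of a pattern within a sequence are assumptions made for illustration; the abstract does not spell them out.

```python
from typing import Dict, FrozenSet, List, Optional, Sequence

# Hypothetical data model (an assumption, not taken from the abstract):
# a q-sequence is an ordered list of q-itemsets, and each q-itemset maps
# an item to the quantity purchased.
QItemset = Dict[str, int]
QSequence = List[QItemset]
Pattern = Sequence[FrozenSet[str]]


def pattern_utility_in_sequence(
    pattern: Pattern, qseq: QSequence, price: Dict[str, int]
) -> Optional[int]:
    """Return the best utility of `pattern` in `qseq`, or None if it never occurs.

    Assumption: when a pattern occurs several times in one q-sequence,
    its utility is the maximum over those occurrences.
    """
    # best[j] = highest utility of matching the first j pattern itemsets
    # against the q-itemsets scanned so far (None means no match yet).
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for qitemset in qseq:
        # Walk j downward so one q-itemset matches at most one pattern itemset.
        for j in range(len(pattern), 0, -1):
            if best[j - 1] is None:
                continue
            itemset = pattern[j - 1]
            if all(item in qitemset for item in itemset):
                gain = sum(qitemset[item] * price[item] for item in itemset)
                candidate = best[j - 1] + gain
                if best[j] is None or candidate > best[j]:
                    best[j] = candidate
    return best[-1]


def pattern_utility(
    pattern: Pattern, database: List[QSequence], price: Dict[str, int]
) -> int:
    """Sum the per-sequence utilities over every q-sequence containing the pattern."""
    total = 0
    for qseq in database:
        utility = pattern_utility_in_sequence(pattern, qseq, price)
        if utility is not None:
            total += utility
    return total


def is_high_utility(
    pattern: Pattern,
    database: List[QSequence],
    price: Dict[str, int],
    min_utility: int,
) -> bool:
    """A pattern is high utility when its total utility reaches the threshold."""
    return pattern_utility(pattern, database, price) >= min_utility


# Illustrative unit profits and a two-sequence database (made-up values).
PRICE = {"a": 2, "b": 5, "c": 1}
DATABASE: List[QSequence] = [
    [{"a": 1, "b": 2}, {"c": 3}, {"a": 2, "b": 1}],
    [{"b": 1}, {"a": 4, "c": 2}],
]
B_THEN_A: Pattern = [frozenset({"b"}), frozenset({"a"})]
```

With the sample data, `pattern_utility(B_THEN_A, DATABASE, PRICE)` sums the best occurrence of "b followed by a" in each sequence. Enumerating every candidate pattern this way is exponential in general, which is the cost that a tree-based search with pruning is meant to avoid.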