From sequential pattern mining to structured pattern mining: a pattern-growth approach

Authors:
Jia-Wei Han; Jian Pei; Xi-Feng Yan
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL; State University of New York at Buffalo, Buffalo, NY; University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
Journal of Computer Science and Technology
Year:
2004

Abstract

Sequential pattern mining is an important data mining problem with broad applications. However, it is also a challenging problem, since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Recent studies have developed two major classes of sequential pattern mining methods: (1) a candidate generation-and-test approach, represented by (i) GSP, a horizontal format-based sequential pattern mining method, and (ii) SPADE, a vertical format-based method; and (2) a pattern-growth method, represented by PrefixSpan and its further extensions, such as gSpan for mining structured patterns. In this study, we give a systematic introduction to the pattern-growth methodology and study its principles and extensions. We first introduce two interesting pattern-growth algorithms, FreeSpan and PrefixSpan, for efficient sequential pattern mining. Then we introduce gSpan for mining structured patterns using the same methodology. Their relative performance in large databases is presented and analyzed. Several extensions of these methods are also discussed in the paper, including mining multi-level, multi-dimensional patterns and mining constraint-based patterns.
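
To make the pattern-growth idea concrete, the following is a minimal sketch in the spirit of prefix-projected growth, not the authors' implementation. It assumes a simplified setting in which every sequence element is a single item rather than an itemset, and it represents each projected database as (sequence index, offset) pairs. The names `prefixspan`, `grow`, and `min_support` are illustrative choices and do not come from the paper.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for sequences of single items.

    Simplifying assumption: each element of a sequence is one item,
    not an itemset as in the general sequential pattern setting.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start offset) pairs, i.e. the
        # suffixes that remain after matching the current prefix.
        counts = defaultdict(int)
        for sid, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[sid][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Project on the first occurrence of item in each suffix.
            new_projected = []
            for sid, start in projected:
                try:
                    pos = sequences[sid].index(item, start)
                except ValueError:
                    continue
                new_projected.append((sid, pos + 1))

            # Grow the pattern only inside its own projected database;
            # no candidate sequences are generated and tested globally.
            grow(pattern, new_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "c"], ["b", "c"]]
    for pattern, support in prefixspan(db, min_support=2).items():
        print(pattern, support)
```

Storing offsets instead of copying suffixes keeps the projected databases small, and each recursive call scans only the suffixes that still contain the current prefix, which is the contrast with the candidate generation-and-test approach described above.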