SPADE: An Efficient Algorithm for Mining Frequent Sequences

Authors: Mohammed J. Zaki
Affiliations: Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY 12180-3590. zaki@cs.rpi.edu
Venue: Machine Learning
Year: 2001

Abstract

In this paper we present SPADE, a new algorithm for fast discovery of sequential patterns. Existing solutions to this problem make repeated database scans and use complex hash structures that have poor locality. SPADE uses combinatorial properties to decompose the original problem into smaller sub-problems that can be solved independently in main memory using efficient lattice search techniques and simple join operations. All sequences are discovered in only three database scans. Experiments show that SPADE outperforms the best previous algorithm by a factor of two, and by an order of magnitude with some pre-processed data. It also scales linearly with the number of input sequences and with a number of other database parameters. Finally, we discuss how the results of sequence mining can be applied in a real application domain.
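
The abstract does not spell out the "simple join operations", so the following is only an illustrative sketch. It assumes a vertical layout in which each item keeps a list of (sequence id, event id) pairs, and it shows how the support of a two-item pattern "A followed by B" can be counted by joining two such lists in memory, without rescanning the database. The toy database and the names `build_idlists`, `temporal_join`, and `support` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Toy database (hypothetical): sequence id -> list of (event id, set of items).
database = {
    1: [(10, {"A", "B"}), (20, {"B"}), (30, {"A"})],
    2: [(15, {"B"}), (25, {"A"})],
    3: [(10, {"A"}), (20, {"B"})],
}


def build_idlists(db):
    """Map each item to a sorted list of (sid, eid) pairs where it occurs."""
    idlists = defaultdict(list)
    for sid, events in db.items():
        for eid, items in events:
            for item in items:
                idlists[item].append((sid, eid))
    return {item: sorted(pairs) for item, pairs in idlists.items()}


def temporal_join(first, second):
    """Id-list for the pattern 'first followed by second'.

    Keeps each (sid, eid) of `second` that occurs strictly after the
    earliest occurrence of `first` in the same sequence.
    """
    earliest = {}
    for sid, eid in first:
        earliest[sid] = min(eid, earliest.get(sid, eid))
    return [
        (sid, eid)
        for sid, eid in second
        if sid in earliest and eid > earliest[sid]
    ]


def support(idlist):
    """Number of distinct sequences that contain the pattern."""
    return len({sid for sid, _ in idlist})


idlists = build_idlists(database)
a_then_b = temporal_join(idlists["A"], idlists["B"])
b_then_a = temporal_join(idlists["B"], idlists["A"])
print(a_then_b, support(a_then_b))
print(b_then_a, support(b_then_a))
```

Because the joined list has the same (sid, eid) shape as its inputs, it can be joined again with another item's list to grow longer patterns, which is one way a sub-problem could be handled entirely in main memory once the initial lists are built.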