Given a sequence database, can we derive a non-trivial upper bound on the number of sequential patterns? Bounding sequential patterns is very challenging in theory because of the combinatorial complexity of sequences, despite some inspiring results on bounding itemsets in frequent itemset mining. The problem also matters in practice, since an upper bound can be used in many applications, such as space allocation when building sequence data warehouses. In this paper, we tackle the problem by presenting, for the first time in the field of sequential pattern mining, strong combinatorial results on the number of possible sequential patterns that can be generated at a given length k. As a case study, we introduce two novel techniques for estimating the number of candidate sequences. An extensive empirical study on both real and synthetic data verifies the effectiveness of our methods.
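To make the counting problem concrete, the sketch below computes the trivial, database-independent count of distinct sequences of length k over n items. It is not the bound derived in this paper. It assumes the usual convention that a sequence is an ordered list of non-empty itemsets and that its length is the total number of items it contains. The function name `count_sequences` and its parameters are hypothetical.

```python
from math import comb


def count_sequences(n_items: int, k: int) -> int:
    """Count sequences of total length k over n_items distinct items.

    Assumption (not taken from the paper): a sequence is an ordered list of
    non-empty itemsets, and its length is the sum of the itemset sizes.
    The recurrence fixes the size j of the first itemset, which can be
    chosen in C(n_items, j) ways, and counts the remainder of length k - j.
    """
    counts = [1] + [0] * k  # counts[0] = 1: the empty sequence
    for length in range(1, k + 1):
        counts[length] = sum(
            comb(n_items, j) * counts[length - j]
            for j in range(1, min(length, n_items) + 1)
        )
    return counts[k]


if __name__ == "__main__":
    # Two items a and b, length 2:
    # <(a)(a)>, <(a)(b)>, <(b)(a)>, <(b)(b)>, <(ab)>
    print(count_sequences(2, 2))  # 5
```

This count ignores the database entirely, so it grows very quickly with k. A non-trivial bound of the kind the abstract asks for would have to use information from the sequence database itself.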