Mining frequent sequential patterns from sequence databases has been a central research topic in data mining, and various efficient sequential pattern mining algorithms have been proposed and studied. Recently, a new line of sequential pattern mining research, mining repetitive gapped subsequences, has attracted the attention of many researchers. However, the number of repetitive gapped subsequences generated even by closed mining algorithms may be too large for users to understand, especially when the support threshold is low. In this paper, we pose the problem of compressing repetitive gapped sequential patterns. We propose a novel distance measure for repetitive gapped sequential patterns and an efficient representative pattern checking scheme, *** -dominate sequential pattern checking. We also develop an efficient algorithm, CRGSgrow (Compressing Repetitive Gapped Sequential pattern grow), which includes an efficient pruning strategy, SyncScan. An empirical study with both real and synthetic data sets shows that CRGSgrow performs well.