SPADE: an efficient algorithm for mining frequent sequences
Machine Learning
KDD-Cup 2000 organizers' report: peeling the onion
ACM SIGKDD Explorations Newsletter - Special issue on “Scalable data mining algorithms”
Mining long sequential patterns in a noisy environment
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Mining Sequential Patterns: Generalizations and Performance Improvements
EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Sequential PAttern mining using a bitmap representation
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Discovering all most specific sentences
ACM Transactions on Database Systems (TODS)
Introducing Uncertainty into Pattern Discovery in Temporal Event Sequences
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach
IEEE Transactions on Knowledge and Data Engineering
Foundations of probabilistic answers to queries
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Ranking queries on uncertain data: a probabilistic threshold approach
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Finding frequent items in probabilistic data
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Managing and Mining Uncertain Data
Probabilistic Event Extraction from RFID Data
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Semantics of Ranking Queries for Probabilistic Data and Expected Ranks
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
The 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Frequent pattern mining with uncertain data
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Probabilistic frequent itemset mining in uncertain databases
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Creating probabilistic databases from duplicated data
The VLDB Journal — The International Journal on Very Large Data Bases
Mining frequent itemsets from uncertain data
PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
A decremental approach for mining frequent itemsets from uncertain data
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
On probabilistic models for uncertain sequential pattern mining
ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
Mining sequential patterns from probabilistic databases by pattern-growth
BNCOD'11 Proceedings of the 28th British national conference on Advances in databases
Mining probabilistically frequent sequential patterns in uncertain databases
Proceedings of the 15th International Conference on Extending Database Technology
Community trend outlier detection using soft temporal pattern mining
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Mining frequent serial episodes over uncertain sequence data
Proceedings of the 16th International Conference on Extending Database Technology
Discovering frequent itemsets on uncertain data: a systematic review
MLDM'13 Proceedings of the 9th international conference on Machine Learning and Data Mining in Pattern Recognition
Editorial: Pattern-growth based frequent serial episode discovery
Data & Knowledge Engineering
Mining order-preserving submatrices from probabilistic matrices
ACM Transactions on Database Systems (TODS)
Mining maximal frequent patterns by considering weight conditions over data streams
Knowledge-Based Systems
We consider sequential pattern mining in situations where there is uncertainty about which source an event is associated with. We model this uncertainty in the probabilistic database framework and consider the problem of enumerating all sequences whose expected support is sufficiently large. Unlike frequent itemset mining in probabilistic databases [Aggarwal et al., KDD'09; Chui et al., PAKDD'07; Chui and Kao, PAKDD'08], we use dynamic programming (DP) to compute the probability that a source supports a sequence, and show that this suffices to compute the expected support of a sequential pattern. Next, we embed this DP algorithm into candidate generate-and-test approaches, and explore the pattern lattice in both a breadth-first (similar to GSP) and a depth-first (similar to SPAM) manner. We propose optimizations for efficiently computing the frequent 1-sequences, for reusing previously computed results through incremental support computation, and for eliminating candidate sequences without computing their support via probabilistic pruning. Preliminary experiments show that our optimizations are effective in reducing the CPU cost.
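The DP step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each event is an item set paired with a probability distribution over candidate sources, that events are assigned to sources independently, and that a pattern element matches an event when it is a subset of the event's items. The names `support_probability` and `expected_support` and the data layout are hypothetical. The expected support is obtained by summing the per-source support probabilities, which follows from linearity of expectation.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# Assumed layout: an event is (items, {source: probability that the event
# belongs to that source}). Events are listed in temporal order.
Event = Tuple[FrozenSet[str], Dict[str, float]]
Pattern = Sequence[FrozenSet[str]]


def support_probability(pattern: Pattern, events: List[Event], source: str) -> float:
    """Probability that `source` supports `pattern`, assuming independent events."""
    k = len(pattern)
    # prob[q] = probability that pattern[:q] is a subsequence of the events
    # (seen so far) that actually belong to `source`. The empty prefix is
    # always supported.
    prob = [1.0] + [0.0] * k
    for items, dist in events:
        p = dist.get(source, 0.0)
        if p == 0.0:
            continue  # this event can never belong to `source`
        # Sweep right to left so prob[q - 1] still describes the previous events.
        for q in range(k, 0, -1):
            if pattern[q - 1] <= items:
                # Event absent (1 - p): nothing changes.
                # Event present (p): pattern[:q] is supported as soon as
                # pattern[:q - 1] was supported before this event.
                prob[q] = (1.0 - p) * prob[q] + p * prob[q - 1]
    return prob[k]


def expected_support(pattern: Pattern, events: List[Event]) -> float:
    """Expected number of sources supporting `pattern` (linearity of expectation)."""
    sources = {s for _, dist in events for s in dist}
    return sum(support_probability(pattern, events, s) for s in sorted(sources))


if __name__ == "__main__":
    a, b = frozenset({"a"}), frozenset({"b"})
    events = [(a, {"X": 0.6, "Y": 0.4}), (b, {"X": 0.5, "Y": 0.5})]
    print(support_probability([a, b], events, "X"))
    print(expected_support([a, b], events))
```

In a generate-and-test miner, a candidate would be kept when `expected_support` meets a user-chosen threshold. The table is one-dimensional here because only the previous event's column is needed at each step.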