Mining sequential patterns from probabilistic databases

Authors:
Muhammad Muzammal;Rajeev Raman
Affiliations:
Department of Computer Science, University of Leicester, UK;Department of Computer Science, University of Leicester, UK
Venue:
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II
Year:
2011

Citing 22
Cited 8

SPADE: an efficient algorithm for mining frequent sequences

Machine Learning
KDD-Cup 2000 organizers' report: peeling the onion

ACM SIGKDD Explorations Newsletter - Special issue on “Scalable data mining algorithms”
Mining long sequential patterns in a noisy environment

Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Mining Sequential Patterns: Generalizations and Performance Improvements

EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
Mining Sequential Patterns

ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Sequential PAttern mining using a bitmap representation

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Discovering all most specific sentences

ACM Transactions on Database Systems (TODS)
Introducing Uncertainty into Pattern Discovery in Temporal Event Sequences

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach

IEEE Transactions on Knowledge and Data Engineering
Foundations of probabilistic answers to queries

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Ranking queries on uncertain data: a probabilistic threshold approach

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Finding frequent items in probabilistic data

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Managing and Mining Uncertain Data

Managing and Mining Uncertain Data
Probabilistic Event Extraction from RFID Data

ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Semantics of Ranking Queries for Probabilistic Data and Expected Ranks

ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining

The 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Frequent pattern mining with uncertain data

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Probabilistic frequent itemset mining in uncertain databases

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Creating probabilistic databases from duplicated data

The VLDB Journal — The International Journal on Very Large Data Bases
Mining frequent itemsets from uncertain data

PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
A decremental approach for mining frequent itemsets from uncertain data

PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
On probabilistic models for uncertain sequential pattern mining

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I

Mining sequential patterns from probabilistic databases by pattern-growth

BNCOD'11 Proceedings of the 28th British national conference on Advances in databases
Mining probabilistically frequent sequential patterns in uncertain databases

Proceedings of the 15th International Conference on Extending Database Technology
Community trend outlier detection using soft temporal pattern mining

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Mining frequent serial episodes over uncertain sequence data

Proceedings of the 16th International Conference on Extending Database Technology
Discovering frequent itemsets on uncertain data: a systematic review

MLDM'13 Proceedings of the 9th international conference on Machine Learning and Data Mining in Pattern Recognition
Editorial: Pattern-growth based frequent serial episode discovery

Data & Knowledge Engineering
Mining order-preserving submatrices from probabilistic matrices

ACM Transactions on Database Systems (TODS)
Mining maximal frequent patterns by considering weight conditions over data streams

Knowledge-Based Systems

Abstract

We consider sequential pattern mining in situations where there is uncertainty about which source an event is associated with. We model this in the probabilistic database framework and consider the problem of enumerating all sequences whose expected support is sufficiently large. Unlike frequent itemset mining in probabilistic databases [Aggarwal et al., KDD'09; Chui et al., PAKDD'07; Chui and Kao, PAKDD'08], we use dynamic programming (DP) to compute the probability that a source supports a sequence, and show that this suffices to compute the expected support of a sequential pattern. Next, we embed this DP algorithm into candidate generate-and-test approaches, and explore the pattern lattice both in a breadth-first (similar to GSP) and a depth-first (similar to SPAM) manner. We propose optimizations for efficiently computing the frequent 1-sequences, for re-using previously computed results through incremental support computation, and for eliminating candidate sequences without computing their support via probabilistic pruning. Preliminary experiments show that our optimizations are effective in reducing the CPU cost.
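The abstract does not give the DP recurrence, so the following is only an illustrative sketch of how such a computation might look, not the authors' algorithm. It assumes that each source is given as a time-ordered list of events, that each event carries an itemset and a probability of belonging to that source, and that these membership probabilities are independent within a source. The names `support_probability` and `expected_support` are hypothetical. The state `prob[k]` is the probability that exactly the first `k` elements of the pattern have been matched so far. The expected support is then taken as the sum of the per-source probabilities, which follows from linearity of expectation.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumed representation: (itemset, probability that the event belongs to this source).
Event = Tuple[FrozenSet[str], float]


def support_probability(events: Sequence[Event],
                        pattern: Sequence[FrozenSet[str]]) -> float:
    """Probability that one source supports `pattern` (hypothetical helper).

    prob[k] = probability that, after the events seen so far, a greedy
    left-to-right match has matched exactly the first k pattern elements.
    """
    q = len(pattern)
    prob: List[float] = [0.0] * (q + 1)
    prob[0] = 1.0
    for items, p in events:
        # Go from high k to low k so one event advances a match by at most one step.
        for k in range(q - 1, -1, -1):
            if pattern[k] <= items:          # event can match pattern element k
                moved = prob[k] * p          # event is present and is consumed
                prob[k + 1] += moved
                prob[k] -= moved
    return prob[q]


def expected_support(sources: Sequence[Sequence[Event]],
                     pattern: Sequence[FrozenSet[str]]) -> float:
    """Sum of per-source support probabilities (linearity of expectation)."""
    return sum(support_probability(events, pattern) for events in sources)


if __name__ == "__main__":
    a, b = frozenset({"a"}), frozenset({"b"})
    sources = [
        [(a, 0.5), (b, 0.5)],   # uncertain source
        [(a, 1.0), (b, 1.0)],   # certain source
    ]
    print(expected_support(sources, [a, b]))
```

A pattern would be reported as frequent when this value meets a user-chosen threshold. The optimizations mentioned in the abstract (incremental support computation, probabilistic pruning) are not modeled in this sketch.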