Data uncertainty is inherent in many real-world applications such as environmental surveillance and mobile tracking. As a result, mining sequential patterns from inaccurate data, such as sensor readings and GPS trajectories, is important for discovering hidden knowledge in such applications. Previous work uses expected support as the measure of pattern frequentness. Expected support has inherent weaknesses with respect to the underlying probability model, and is therefore ineffective for mining high-quality sequential patterns from uncertain sequence databases. In this paper, we propose to measure pattern frequentness based on possible-world semantics. We establish two uncertain sequence data models abstracted from many real-life applications involving uncertain sequence data, and formulate the problem of mining probabilistically frequent sequential patterns (or p-FSPs) from data that conform to our models. Based on the prefix-projection strategy of the well-known PrefixSpan algorithm, we develop two new algorithms, collectively called U-PrefixSpan, for p-FSP mining. U-PrefixSpan effectively avoids the problem of "possible world explosion" and, when combined with our three pruning techniques and one validating technique, achieves good performance. The efficiency and effectiveness of U-PrefixSpan are verified through extensive experiments on both real and synthetic datasets.
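The contrast between expected support and probabilistic frequentness can be illustrated with a small sketch. This is not the U-PrefixSpan algorithm, and it does not implement either of the paper's two data models. It assumes, purely for illustration, that each sequence in the database contains a given pattern independently with a known probability. The function names `expected_support` and `frequentness_probability` are hypothetical. Under that assumption, the probability that the support reaches a threshold can be computed by dynamic programming over the sequences, without enumerating possible worlds.

```python
from typing import List


def expected_support(probs: List[float]) -> float:
    """Expected number of sequences containing the pattern.

    probs[i] is the (assumed independent) probability that
    sequence i contains the pattern.
    """
    return sum(probs)


def frequentness_probability(probs: List[float], min_sup: int) -> float:
    """Probability that at least min_sup sequences contain the pattern.

    dp[k] holds the probability that exactly k of the sequences seen so
    far contain the pattern; the last cell dp[min_sup] is absorbing and
    accumulates the probability of "min_sup or more".
    """
    dp = [0.0] * (min_sup + 1)
    dp[0] = 1.0
    for p in probs:
        # Go from high k to low k so dp[k - 1] is still the old value.
        for k in range(min_sup, -1, -1):
            stay = dp[k] if k == min_sup else dp[k] * (1.0 - p)
            move = dp[k - 1] * p if k > 0 else 0.0
            dp[k] = stay + move
    return dp[min_sup]


# Two hypothetical patterns with the same expected support of 2.0.
certain_pattern = [1.0, 1.0, 0.0, 0.0]
uncertain_pattern = [0.5, 0.5, 0.5, 0.5]

for name, probs in [("certain", certain_pattern), ("uncertain", uncertain_pattern)]:
    print(name, expected_support(probs), frequentness_probability(probs, 2))
```

Both hypothetical patterns have the same expected support, yet the first reaches a support of 2 in every possible world while the second does so only in some of them. A frequentness measure based on possible-world semantics can separate such cases, whereas expected support alone cannot.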