Incremental update on probabilistic frequent itemsets in uncertain databases

Authors:
Ming-Yen Lin;Cheng-Tai Fu;Sue-Chen Hsueh
Affiliations:
Feng Chia University, Taichung, Taiwan;Feng Chia University, Taichung, Taiwan;Chaoyang University of Technology, Taichung, Taiwan
Venue:
Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication
Year:
2012

Abstract

Mining frequent itemsets in an uncertain database is a highly complicated problem. Most algorithms focus on improving mining efficiency under the assumption that the database is static. Uncertain databases, however, like certain databases, are constantly updated with newly appended transactions. After an update, some patterns may become obsolete and new ones may emerge. Re-mining the whole uncertain database from scratch is very time-consuming owing to the cost of computing frequentness probabilities. To tackle this maintenance problem, we propose an algorithm called p-FUP for the efficient incremental update of patterns in an uncertain database. The p-FUP algorithm, inspired by a threshold-based probabilistic frequent itemset (PFI) testing technique and the FUP algorithm, uses approximations to incrementally update and discover frequent itemsets in the uncertain database. Comprehensive experiments on both real and synthetic datasets show that p-FUP is on average 2.8 times faster than the re-mining-based algorithm and scales linearly.
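
To illustrate why re-mining is costly and why an incremental update can help, here is a minimal Python sketch. It is not the p-FUP algorithm, which the abstract says relies on approximations. It assumes a common model that the abstract does not spell out: transactions are independent, each contains the itemset with a known probability, and the frequentness probability is P(support >= minsup). The function names extend_support_pmf and frequentness_probability are hypothetical. The sketch computes the exact support distribution by dynamic programming and folds appended transactions into the stored distribution instead of rescanning the original database.

```python
from typing import List


def extend_support_pmf(pmf: List[float], probs: List[float]) -> List[float]:
    """Fold further transactions into a support distribution.

    pmf[k] is P(support == k) over the transactions seen so far. probs holds,
    for each additional transaction, the (assumed independent) probability
    that the transaction contains the itemset.
    """
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # transaction does not contain the itemset
            nxt[k + 1] += mass * p      # transaction contains the itemset
        pmf = nxt
    return pmf


def frequentness_probability(pmf: List[float], minsup: int) -> float:
    """Return P(support >= minsup) for the given support distribution."""
    return sum(pmf[minsup:])


# Original database: per-transaction probabilities of containing the itemset.
original = [0.5, 0.5]
pmf = extend_support_pmf([1.0], original)
before = frequentness_probability(pmf, minsup=2)

# Appended increment: extend the stored distribution rather than recomputing
# it over all transactions.
increment = [1.0]
pmf = extend_support_pmf(pmf, increment)
after = frequentness_probability(pmf, minsup=2)

print(before, after)
```

Each appended transaction costs time proportional to the current length of the distribution, whereas recomputing from scratch repeats that work for every transaction in the original database. Storing a full distribution per itemset is itself expensive, which is one reason approximate tests are attractive in this setting.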