Data uncertainty is inherent in emerging applications such as location-based services, sensor monitoring systems, and data integration. Uncertain databases have recently been developed to handle large amounts of imprecise information. In this paper, we study how to efficiently discover frequent itemsets from large uncertain databases, interpreted under the Possible World Semantics. This is technically challenging, since an uncertain database induces an exponential number of possible worlds. To tackle this problem, we propose a novel method that models the itemset mining process as a Poisson binomial distribution. This model-based approach extracts frequent itemsets with a high degree of accuracy and supports large databases. We apply our techniques to improve the performance of algorithms for: (1) finding itemsets whose frequentness probabilities are larger than some threshold; and (2) mining itemsets with the k highest frequentness probabilities. Our approaches support both the tuple and attribute uncertainty models, which are commonly used to represent uncertain databases. Extensive evaluation on real and synthetic datasets shows that our methods are highly accurate. Moreover, they are orders of magnitude faster than previous approaches.
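To make the Poisson binomial view concrete, the following is a minimal sketch, not the paper's algorithm. It assumes each transaction contains a given itemset independently with a known probability, so that the itemset's support count follows a Poisson binomial distribution and the frequentness probability is its upper tail. The function names `frequentness_exact` and `frequentness_poisson` are hypothetical, and the Poisson approximation is one standard way to approximate a Poisson binomial tail, shown here only for illustration.

```python
import math


def frequentness_exact(probs, minsup):
    """Exact P(support >= minsup) by dynamic programming.

    probs[j] is the (assumed independent) probability that transaction j
    contains the itemset; the support count is then Poisson binomial.
    """
    # dist[k] = probability that exactly k of the transactions seen so far
    # contain the itemset.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)   # transaction does not contain the itemset
            nxt[k + 1] += q * p       # transaction contains the itemset
        dist = nxt
    return sum(dist[minsup:])


def frequentness_poisson(probs, minsup):
    """Approximate P(support >= minsup) with a Poisson of the same mean."""
    mu = sum(probs)                   # expected support
    term = math.exp(-mu)              # Poisson pmf at k = 0
    cdf = 0.0
    for k in range(minsup):           # accumulate P(Poisson(mu) <= minsup - 1)
        cdf += term
        term *= mu / (k + 1)
    return 1.0 - cdf


if __name__ == "__main__":
    # Two transactions, each containing the itemset with probability 0.5.
    print(frequentness_exact([0.5, 0.5], 1))    # 0.75
    # Many low-probability transactions: the two values should be close.
    probs = [0.01] * 1000
    print(frequentness_exact(probs, 10), frequentness_poisson(probs, 10))
```

The exact routine here costs time quadratic in the number of transactions, whereas the approximation needs only the expected support and a sum of `minsup` terms. That gap is the kind of saving a model-based approach can exploit. The approximation is most reliable when the individual probabilities are small.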