In recent years, the wide application of uncertain data has drawn considerable attention to mining frequent itemsets over uncertain databases. In an uncertain database, the support of an itemset is a random variable rather than a fixed occurrence count. Thus, unlike in deterministic databases, where a frequent itemset has a unique definition, a frequent itemset in an uncertain setting has so far been defined in two different ways. The first definition, referred to as the expected support-based frequent itemset, uses the expectation of an itemset's support to decide whether the itemset is frequent. The second definition, referred to as the probabilistic frequent itemset, uses the probability that an itemset's support reaches a minimum support threshold to decide whether the itemset is frequent. Consequently, existing work on mining frequent itemsets over uncertain databases falls into two separate groups, and no study has comprehensively compared the two definitions. In addition, because no uniform experimental platform exists, current solutions for the same definition sometimes report inconsistent results. In this paper, we first clarify the relationship between the two definitions. Through extensive experiments, we verify that the two definitions are tightly connected and can be unified when the data is large enough. Second, we provide baseline implementations of eight existing representative algorithms and test their performance fairly with uniform measures. Finally, based on fair tests over many benchmark data sets, we clarify several inconsistent conclusions from existing work and discuss some new findings.
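To make the two definitions concrete, the following minimal sketch computes both measures for a single itemset. It assumes that each transaction contains the itemset independently with a known probability, which is an assumption of this example and not something the abstract specifies. The function names, the probabilities, and the threshold of 2 are hypothetical and chosen only for illustration; this is not one of the eight algorithms evaluated in the paper.

```python
from typing import List


def expected_support(probs: List[float]) -> float:
    """Expected support: sum of per-transaction containment probabilities.

    Assumes probs[i] is the probability that transaction i contains the
    itemset (hypothetical input format for this sketch).
    """
    return sum(probs)


def frequent_probability(probs: List[float], min_count: int) -> float:
    """Probability that the support is at least min_count.

    Assumes independent transactions, so the support follows a
    Poisson binomial distribution, computed here by dynamic programming.
    """
    # dist[k] holds P(support == k) over the transactions seen so far.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # transaction does not contain the itemset
            nxt[k + 1] += mass * p       # transaction contains the itemset
        dist = nxt
    return sum(dist[min_count:])


# Hypothetical containment probabilities for four uncertain transactions.
probs = [0.9, 0.8, 0.5, 0.4]
esup = expected_support(probs)
pfreq = frequent_probability(probs, 2)
print(f"expected support = {esup:.3f}")
print(f"P(support >= 2)  = {pfreq:.3f}")
```

Under the first definition, the itemset would be declared frequent when `esup` meets a minimum expected support; under the second, when `pfreq` exceeds a chosen probability threshold. Both thresholds are user-supplied parameters in this sketch.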