In recent years, many new applications, such as sensor network monitoring and moving object search, have made uncertain data management and mining increasingly important. In this paper, we study the problem of discovering threshold-based frequent closed item sets over probabilistic data. Frequent item set mining over probabilistic databases has attracted much attention recently. However, existing solutions may produce an exponential number of results due to the downward closure property over probabilistic data. Moreover, it is hard to directly extend successful techniques for mining exact data to a probabilistic environment because of the inherent uncertainty of the data. Thus, in order to obtain a reasonably small result set, we study the discovery of frequent closed item sets over probabilistic data. We prove that even a sub-problem of this problem, computing the frequent closed probability of an item set, is #P-hard. Therefore, we develop an efficient mining algorithm based on a depth-first search strategy to obtain all probabilistic frequent closed item sets. To reduce the search space and avoid redundant computation, we further design several probabilistic pruning and bounding techniques. Finally, we verify the effectiveness and efficiency of the proposed methods through extensive experiments.
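To make the notion of frequent closed probability concrete, the following minimal Python sketch computes it by brute force over all possible worlds. It is not the paper's algorithm. It assumes a tuple-uncertainty model in which each transaction exists independently with a given probability, and it takes an item set to be closed in a world when no one-item extension has the same support. The function names, the `db` layout, and the toy data are hypothetical and chosen only for illustration. The enumeration is exponential in the number of transactions, which is consistent with the hardness result stated above and shows why pruning and bounding matter.

```python
from itertools import product


def support(itemset, world):
    """Number of transactions in a (certain) world that contain itemset."""
    return sum(1 for t in world if itemset <= t)


def is_closed(itemset, world, all_items):
    """True if every one-item extension has strictly smaller support."""
    s = support(itemset, world)
    return all(support(itemset | {i}, world) < s for i in all_items - itemset)


def frequent_closed_probability(itemset, db, minsup):
    """Sum the probabilities of all possible worlds in which itemset is
    both frequent (support >= minsup) and closed.

    db is a list of (transaction, existence_probability) pairs; transactions
    are assumed to exist independently (an assumption of this sketch).
    """
    itemset = frozenset(itemset)
    all_items = frozenset().union(*(t for t, _ in db))
    total = 0.0
    # Enumerate all 2^n worlds: mask[i] == 1 means transaction i is present.
    for mask in product([0, 1], repeat=len(db)):
        prob = 1.0
        world = []
        for present, (t, p) in zip(mask, db):
            prob *= p if present else 1.0 - p
            if present:
                world.append(t)
        if support(itemset, world) >= minsup and is_closed(itemset, world, all_items):
            total += prob
    return total


# Hypothetical toy database: two uncertain transactions.
db = [
    (frozenset({"a", "b"}), 0.5),
    (frozenset({"a"}), 0.5),
]

fcp_a = frequent_closed_probability({"a"}, db, minsup=1)
fcp_ab = frequent_closed_probability({"a", "b"}, db, minsup=1)
print(fcp_a, fcp_ab)
```

In this toy database, {a} is frequent and closed only in worlds that contain the second transaction, while {a, b} is frequent and closed whenever the first transaction is present. Under a threshold-based definition, an item set would be reported when this probability meets a user-given probability threshold.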