Estimating the number of frequent itemsets at a minimum support threshold α in a large dataset is of great interest from both theoretical and practical perspectives. However, counting the frequent itemsets, and even counting only the maximal frequent itemsets, is #P-complete. In this study, we provide a theoretical investigation of the sampling estimator, and we discover and prove several fundamental but rather surprising properties of it. We also propose a novel algorithm that estimates the number of frequent itemsets without sampling. Detailed experimental results demonstrate the accuracy and efficiency of the proposed approach.
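The abstract does not spell out the sampling estimator, so the following is only a minimal sketch under one simple assumption: draw a uniform random sample of transactions, count the itemsets that are frequent in the sample at the same relative support α, and report that count as the estimate. The function names `count_frequent_itemsets` and `sampling_estimate` are hypothetical and do not come from the paper, and the exact levelwise counter is exponential in the worst case, so it is meant for toy data only.

```python
import random
from itertools import combinations


def count_frequent_itemsets(transactions, alpha):
    """Count non-empty itemsets whose relative support is at least alpha.

    Uses a plain levelwise (Apriori-style) search; illustrative only.
    """
    n = len(transactions)
    if n == 0:
        return 0
    min_count = alpha * n
    tsets = [frozenset(t) for t in transactions]

    # Level 1: frequent single items.
    items = sorted({item for t in tsets for item in t})
    current = [
        frozenset([item])
        for item in items
        if sum(1 for t in tsets if item in t) >= min_count
    ]

    total = 0
    k = 1
    while current:
        total += len(current)
        frequent = set(current)
        # Build size-(k+1) candidates whose size-k subsets are all frequent.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in frequent for sub in combinations(union, k)
            ):
                candidates.add(union)
        # Keep only the candidates that meet the support threshold.
        current = [
            c for c in candidates
            if sum(1 for t in tsets if c <= t) >= min_count
        ]
        k += 1
    return total


def sampling_estimate(transactions, alpha, sample_size, seed=0):
    """Assumed estimator: frequent-itemset count in a uniform random sample."""
    rng = random.Random(seed)
    sample = rng.sample(transactions, sample_size)
    return count_frequent_itemsets(sample, alpha)


toy = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
exact = count_frequent_itemsets(toy, 0.5)
estimate = sampling_estimate(toy, 0.5, sample_size=2)
```

On the toy data, `exact` is the true count for α = 0.5, while `estimate` depends on which two transactions are drawn. Comparing the two over many seeds gives a rough empirical feel for how the sample count behaves relative to the true count.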