Computing frequent itemsets and maximal frequent itemsets in a database are classic problems in data mining. The resource requirements of all extant algorithms for both problems depend on the distribution of frequent patterns, a topic that has not been formally investigated. In this paper, we study properties of the length distributions of frequent and maximal frequent itemset collections and provide novel solutions for computing tight lower bounds for feasible distributions. We show how these bounding distributions can help in generating realistic synthetic datasets, which can be used for algorithm benchmarking.
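To make the term "length distribution" concrete, here is a minimal sketch that counts how many frequent and maximal frequent itemsets there are at each length in a toy database. It is illustrative only and is not the bounding method of the paper. The function names (`frequent_itemsets`, `maximal_itemsets`, `length_distribution`), the toy transactions, and the absolute support threshold of 2 are all assumptions made for the example, and the brute-force enumeration is suitable only for very small inputs.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent itemsets (toy data only).

    min_support is an absolute count of supporting transactions.
    """
    items = sorted(set().union(*transactions))
    frequent = []
    for k in range(1, len(items) + 1):
        level = [
            frozenset(candidate)
            for candidate in combinations(items, k)
            if sum(1 for t in transactions if set(candidate) <= t) >= min_support
        ]
        if not level:
            # Downward closure: if no k-itemset is frequent,
            # no larger itemset can be frequent either.
            break
        frequent.extend(level)
    return frequent


def maximal_itemsets(frequent):
    """Keep the frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


def length_distribution(itemsets):
    """Map each itemset length to the number of itemsets of that length."""
    counts = Counter(len(s) for s in itemsets)
    return dict(sorted(counts.items()))


# Hypothetical toy database of five transactions.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
    {"c", "d"},
]

frequent = frequent_itemsets(transactions, min_support=2)
maximal = maximal_itemsets(frequent)

freq_dist = length_distribution(frequent)  # {1: 4, 2: 3, 3: 1}
max_dist = length_distribution(maximal)    # {1: 1, 3: 1}
print(freq_dist)
print(max_dist)
```

In this toy database the maximal frequent itemsets are {d} and {a, b, c}, so the maximal distribution is much sparser than the frequent one. Vectors of per-length counts like these are the kind of object whose feasibility the abstract refers to.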