Mining quantitative association rules in large relational tables
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Fast discovery of association rules
Advances in knowledge discovery and data mining
Data mining, hypergraph transversals, and machine learning (extended abstract)
PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Generating non-redundant association rules
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Discovering All Most Specific Sentences by Randomized Algorithms
ICDT '97 Proceedings of the 6th International Conference on Database Theory
A Tight Upper Bound on the Number of Candidate Patterns
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Sampling Large Databases for Association Rules
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Optimizing hypergraph transversal computation with an anti-monotone constraint
Proceedings of the 2007 ACM symposium on Applied computing
A Data Mining Formalization to Improve Hypergraph Minimal Transversal Computation
Fundamenta Informaticae
Estimating the number of frequent itemsets in a large database
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
A Data Mining Formalization to Improve Hypergraph Minimal Transversal Computation
Fundamenta Informaticae
Information Sciences: an International Journal
Hi-index | 0.00 |
In data mining, enumerate the frequent or the closed patterns is often the first difficult task leading to the association rules discovery. The number of these patterns represents a great interest. The lower bound is known to be constant whereas the upper bound is exponential, but both situations correspond to pathological cases. For the first time, we give an average analysis of the number of frequent or closed patterns. Average analysis is often closer to real situations and gives more information about the role of the parameters. In this paper, two probabilistic models are studied: a BERNOULLI and a MARKOVIAN. In both models and for large databases, we prove that the number of frequent patterns, for a fixed frequency threshold , is exponential in the number of items and polynomial in the number of transactions. On the other hand, for a proportional frequency threshold , the number of frequent patterns is polynomial in the number of items and does not involve the number of transactions. Finally, we prove in the BERNOULLI model that the number of closed patterns, for a proportional frequency threshold, is polynomial in the number of items.