Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Fast discovery of association rules
Advances in knowledge discovery and data mining
Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator
ACM Transactions on Modeling and Computer Simulation (TOMACS) - Special issue on uniform random number generation
Efficiently mining long patterns from databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Discovering All Most Specific Sentences by Randomized Algorithms
ICDT '97 Proceedings of the 6th International Conference on Database Theory
On the Complexity of Generating Maximal Frequent and Minimal Infrequent Sets
STACS '02 Proceedings of the 19th Annual Symposium on Theoretical Aspects of Computer Science
Discovering all most specific sentences
ACM Transactions on Database Systems (TODS)
An inequality limiting the number of maximal frequent sets
In their seminal work on Go With the Winners (GWW) algorithms, D. Aldous and U. Vazirani [3] proved a sufficient condition on the number of particles needed to reach the bottom of a tree with high probability via a GWW random walk. Using this result in practice, however, would require knowledge of the entire search tree, which is infeasible for most problems. In this paper we improve slightly on this situation by deriving a recurrence relation that gives an upper bound on a tree's imbalance in terms of the imbalance between tree levels that are close to one another, provided that these latter imbalances can be measured with sufficient accuracy.

We then turn our attention to the problem of finding both frequent and infrequent patterns in a database. One of the most widely used algorithms for finding frequent patterns in memory-resident databases is a randomized algorithm first proposed by Gunopulos et al. [12]. We show that this algorithm is precisely the kind that the GWW paradigm was designed to improve. Experimental results on the Splice-junction Gene Sequences Database [4] are also provided and lend empirical support to the benefits of using GWW.
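As a rough illustration of the GWW random walk mentioned above, the following is a minimal Python sketch, not the authors' implementation. It assumes the commonly described form of the scheme: a fixed population of particles starts at the root, particles stuck at dead-end leaves are replaced by copies of randomly chosen surviving particles, and every particle then steps to a uniformly random child. The names `gww_walk`, `children`, and `num_particles`, and the toy tree, are hypothetical and chosen only for this example.

```python
import random


def gww_walk(children, root, depth, num_particles, rng=None):
    """Sketch of a Go With the Winners walk down a tree.

    `children` is a callable (an assumption of this sketch) that maps a
    node to the list of its children; a leaf maps to an empty list.
    Returns the particle positions after `depth` steps, or [] if every
    particle was stuck at a leaf before reaching that depth.
    """
    rng = rng or random.Random()
    particles = [root] * num_particles
    for _ in range(depth):
        # Survivors are the particles that can still move down.
        alive = [p for p in particles if children(p)]
        if not alive:
            return []
        # Each stuck particle is replaced by a copy of a random survivor,
        # so the population size stays constant.
        particles = [p if children(p) else rng.choice(alive) for p in particles]
        # Every particle steps to a uniformly random child.
        particles = [rng.choice(children(p)) for p in particles]
    return particles


# Toy imbalanced tree (hypothetical): 'a' is a dead end at depth 1.
toy_tree = {"r": ["a", "b"], "b": ["c"], "c": ["d"]}
positions = gww_walk(lambda n: toy_tree.get(n, []), "r", depth=3, num_particles=8)
```

In this toy tree a single particle without the replacement step reaches depth 3 only half the time, whereas the particle population fails only if every particle picks the dead end at the first step. How many particles are needed in general depends on the tree's imbalance, which is the quantity the abstract's recurrence relation is meant to bound.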