A "Go With the Winners" approach to finding frequent patterns

  • Authors:
  • Todd Ebert;Darin Goldstein

  • Affiliations:
  • California State University, Long Beach;California State University, Long Beach

  • Venue:
  • Proceedings of the 2005 ACM symposium on Applied computing
  • Year:
  • 2005

Abstract

In their seminal work on Go With the Winners (GWW) algorithms, D. Aldous and U. Vazirani [3] proved a sufficient condition on the number of particles needed to reach the bottom of a tree with high probability via a GWW random walk. However, using this result in practice would require knowledge of the entire search tree, which is infeasible for most problems. In this paper we improve slightly on this situation by deriving a recurrence relation that provides an upper bound for a tree's imbalance in terms of the imbalance between tree levels that are close to one another, provided that these latter imbalances can be measured with sufficient accuracy.

We then turn our attention to the problem of finding both frequent and infrequent patterns in a database. One of the most widely used algorithms for finding frequent patterns in memory-resident databases is a randomized algorithm first proposed by Gunopulos et al. [12]. We show that such an algorithm is precisely the kind that the GWW paradigm was designed to improve on. Experimental results using the Splice-junction Gene Sequences Database [4] are also provided and lend empirical evidence of the benefits of using GWW.
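
For readers unfamiliar with the GWW random walk mentioned in the abstract, the following is a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes the search tree is exposed through a hypothetical `children` callback that returns a list of child nodes; the function name `gww_walk` and its parameters are likewise illustrative assumptions. At each level, particles stuck at dead ends are replaced by clones of randomly chosen surviving particles, and then every particle moves to a uniformly random child.

```python
import random


def gww_walk(children, root, depth, num_particles, rng=None):
    """Sketch of a Go With the Winners walk on a tree.

    children: hypothetical callback, node -> list of child nodes.
    Returns the particle positions after `depth` steps,
    or an empty list if every particle gets stuck first.
    """
    rng = rng or random.Random()
    particles = [root] * num_particles
    for _ in range(depth):
        # Particles at nodes with no children cannot descend further.
        alive = [p for p in particles if children(p)]
        if not alive:
            return []
        # "Go with the winners": replace each stuck particle
        # with a clone of a randomly chosen surviving particle.
        clones = [rng.choice(alive) for _ in range(num_particles - len(alive))]
        particles = alive + clones
        # Every particle moves to a uniformly random child.
        particles = [rng.choice(children(p)) for p in particles]
    return particles


if __name__ == "__main__":
    # Toy tree (an assumption for illustration): "a" is a dead end,
    # and the only path of length 3 from the root ends at "d".
    tree = {"r": ["a", "b"], "a": [], "b": ["c"], "c": ["d"], "d": []}
    print(gww_walk(lambda n: tree[n], "r", 3, 20, random.Random(0)))
```

In this toy tree a plain random walk reaches depth 3 only half the time, whereas the cloning step lets the whole population continue as long as at least one particle avoids the dead end.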