A Randomized Approach for Approximating the Number of Frequent Sets

  • Authors:
  • Mario Boley; Henrik Grosskreutz

  • Venue:
  • ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
  • Year:
  • 2008

Abstract

We investigate the problem of counting the number of frequent (item)sets, a problem known to be intractable in the sense that no exact polynomial-time computation is expected to exist. In this paper, we show that it is in general also hard to approximate. We then develop a randomized counting algorithm based on the Markov chain Monte Carlo method. Although an exponential running time is needed on general inputs to guarantee a given approximation bound, we show empirically that the algorithm still achieves the desired accuracy on real-world datasets when its running time is capped at a polynomial bound.
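To make the counted quantity concrete, the following is a minimal Python sketch and not the algorithm from the paper. It assumes a dataset given as a list of item sets and an absolute support threshold, and it counts the empty set as frequent. All names (`support`, `count_frequent_sets`, `random_walk`) are hypothetical. The first function counts frequent sets exactly by brute force, which takes time exponential in the number of items. The second shows one generic way to random-walk over frequent sets by toggling a single item and rejecting moves that leave the frequent family; the chain actually used by the authors may differ.

```python
from itertools import combinations
import random


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def count_frequent_sets(transactions, min_support):
    """Exact count by enumerating every subset of the item universe.

    Exponential in the number of items; only usable on tiny inputs.
    """
    items = sorted(set().union(*transactions))
    count = 0
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            if support(frozenset(combo), transactions) >= min_support:
                count += 1
    return count


def random_walk(transactions, min_support, steps, seed=0):
    """Illustrative Markov chain on the family of frequent sets.

    Each step picks one item uniformly at random and toggles it; the move
    is accepted only if the resulting set is still frequent. Proposals are
    symmetric, so the uniform distribution over frequent sets is stationary.
    Assumption: the empty set is frequent (min_support <= len(transactions)).
    """
    rng = random.Random(seed)
    items = sorted(set().union(*transactions))
    current = frozenset()
    visited = []
    for _ in range(steps):
        item = rng.choice(items)
        candidate = current ^ {item}  # add the item if absent, else remove it
        if support(candidate, transactions) >= min_support:
            current = candidate
        visited.append(current)
    return visited


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
    # Frequent at threshold 2: {}, {a}, {b}, {c}, {a,b}, {a,c}
    print(count_frequent_sets(data, 2))  # 6
    print(len(set(random_walk(data, 2, steps=500))))  # distinct frequent sets visited
```

The walk only produces samples; turning samples into an estimate of the count with a provable error bound requires additional machinery that this sketch does not attempt.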