Approximating the number of frequent sets in dense data

  • Authors:
  • Mario Boley; Henrik Grosskreutz

  • Affiliations:
  • Fraunhofer IAIS, Schloss Birlinghoven, 53754, Sankt Augustin, Germany (both authors)

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2009

Abstract

We investigate the problem of counting the number of frequent (item)sets, a problem known to be intractable for exact polynomial-time computation. We show that it is, in general, also hard to approximate. We then develop a randomized counting algorithm based on the Markov chain Monte Carlo method. Although guaranteeing a given approximation bound requires exponential running time on general inputs, we show that the algorithm still achieves the desired accuracy on several real-world datasets when its running time is capped polynomially.
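
To make the counting problem concrete, here is a minimal sketch that is not the algorithm from the paper. It assumes a toy dataset of five transactions and an absolute support threshold of 2, and it compares exact brute-force counting (which enumerates all 2^n itemsets) with a naive uniform-sampling estimator. The function names `support`, `count_frequent_exact`, and `count_frequent_sampled` are hypothetical, chosen only for illustration. The paper's Markov chain Monte Carlo method is more sophisticated, and this sketch stands in for it only to show what "approximating the count" means.

```python
import random
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def count_frequent_exact(items, transactions, min_support):
    """Exact count by enumerating all 2^n itemsets (exponential in n)."""
    count = 0
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            if support(frozenset(combo), transactions) >= min_support:
                count += 1
    return count


def count_frequent_sampled(items, transactions, min_support, samples, seed=0):
    """Naive estimator (an assumption of this sketch, not the paper's method):
    draw itemsets uniformly at random and scale the hit rate by 2^n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Each item is included independently with probability 1/2,
        # which yields a uniform draw over all subsets of `items`.
        candidate = frozenset(i for i in items if rng.random() < 0.5)
        if support(candidate, transactions) >= min_support:
            hits += 1
    return (hits / samples) * 2 ** len(items)


# Toy data: five transactions over the items a, b, c, d.
transactions = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("bc"),
    frozenset("abcd"),
]
items = sorted(set().union(*transactions))

exact = count_frequent_exact(items, transactions, min_support=2)
estimate = count_frequent_sampled(items, transactions, min_support=2, samples=20000)
print(exact, round(estimate, 2))
```

On this toy input the frequent sets are the empty set, {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, and {a,b,c}, so the exact count is 8. The naive estimator works here only because frequent sets make up half of all itemsets. When they are a vanishing fraction of the 2^n candidates, uniform sampling almost never hits one, which is one reason a more careful sampling scheme is needed.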