Average Number of Frequent (Closed) Patterns in Bernouilli and Markovian Databases

  • Authors:
  • Francois Rioult;Arnaud Soulet

  • Affiliations:
  • Université de Caen Basse-Normandie;Université de Caen Basse-Normandie

  • Venue:
  • ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

In data mining, enumerate the frequent or the closed patterns is often the first difficult task leading to the association rules discovery. The number of these patterns represents a great interest. The lower bound is known to be constant whereas the upper bound is exponential, but both situations correspond to pathological cases. For the first time, we give an average analysis of the number of frequent or closed patterns. Average analysis is often closer to real situations and gives more information about the role of the parameters. In this paper, two probabilistic models are studied: a BERNOULLI and a MARKOVIAN. In both models and for large databases, we prove that the number of frequent patterns, for a fixed frequency threshold , is exponential in the number of items and polynomial in the number of transactions. On the other hand, for a proportional frequency threshold , the number of frequent patterns is polynomial in the number of items and does not involve the number of transactions. Finally, we prove in the BERNOULLI model that the number of closed patterns, for a proportional frequency threshold, is polynomial in the number of items.