Axiomatization of frequent itemsets

  • Authors:
  • T. Calders;J. Paredaens

  • Affiliations:
  • Departement Wiskunde-Informatica, Universiteit Antwerpen, Universiteitsplein 1, B-2610 Wilrijk, Belgium;Departement Wiskunde-Informatica, Universiteit Antwerpen, Universiteitsplein 1, B-2610 Wilrijk, Belgium

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2003

Abstract

Mining association rules is very popular in the data mining community. Most algorithms designed for finding association rules start by searching for frequent itemsets. Typically, in these algorithms, counting phases and pruning phases are interleaved. In the counting phase, partial information about the frequencies of selected itemsets is gathered. In the pruning phase, as much of the search space as possible is pruned, based on the counting information. We introduce frequent set expressions to represent the (possibly partial) information acquired in the counting phase. A frequent set expression is a pair containing an itemset and a fraction that is a lower bound on the actual frequency of the itemset. A system of frequent sets is a collection of such pairs. We give an axiomatization for those systems that are complete in the sense that they explicitly contain all information they logically imply. Every system of frequent sets has a unique completion that represents all knowledge that can be derived from it. We also study sparse systems, in which an expression is not given for every frequent set. Furthermore, we explore the links with probabilistic logics.
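As a rough illustration of these notions, the sketch below represents a system of frequent sets as a mapping from itemsets to lower bounds and tightens it with a single sound rule: if an itemset has frequency at least p, so does each of its subsets. This is not the paper's axiomatization, which the abstract does not spell out and which may derive stronger bounds; the function name `monotone_closure`, the dictionary encoding, and the example values are assumptions made purely for illustration.

```python
from itertools import combinations


def monotone_closure(system, items):
    """Tighten lower bounds using only the subset (monotonicity) rule.

    system: dict mapping frozenset of items -> lower bound in [0, 1]
    items:  iterable of all items under consideration

    Illustrative sketch only: it is sound, but it is not claimed to
    reach the unique completion described in the abstract.
    """
    closed = {}
    universe = sorted(items)
    for size in range(len(universe) + 1):
        for combo in combinations(universe, size):
            subset = frozenset(combo)
            # Every stated superset of `subset` passes its bound down to it.
            bounds = [p for itemset, p in system.items() if subset <= itemset]
            closed[subset] = max(bounds, default=0.0)
    # The empty itemset is contained in every transaction.
    closed[frozenset()] = 1.0
    return closed


# Hypothetical system: freq(AB) >= 0.4, freq(A) >= 0.3, freq(C) >= 0.5
system = {frozenset("AB"): 0.4, frozenset("A"): 0.3, frozenset("C"): 0.5}
closed = monotone_closure(system, "ABC")
```

In this example the stated bound of 0.3 on {A} is raised to 0.4, because {A, B} already has frequency at least 0.4. Itemsets with no stated superset, such as {A, C}, keep the trivial bound 0 under this rule alone, which mirrors the sparse setting in which an expression is not given for every set.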