Itemset frequency satisfiability: Complexity and axiomatization

  • Authors:
  • Toon Calders

  • Affiliations:
  • Eindhoven University of Technology, Den Dolech 2, HG 7.82a, P.O. Box 513, 5600 MB Eindhoven, The Netherlands and University of Antwerp, Belgium

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2008

Quantified Score

Hi-index 5.23

Visualization

Abstract

Computing frequent itemsets is one of the most prominent problems in data mining. We study the following related problem, called FREQSAT, in depth: given some itemset-interval pairs, does there exist a database such that for every pair the frequency of the itemset falls into the interval? This problem is shown to be NP-complete. The problem is then further extended to include arbitrary Boolean expressions over items and conditional frequency expressions in the form of association rules. We also show that, unless P equals NP, the related function problem-find the best interval for an itemset under some frequency constraints-cannot be approximated efficiently. Furthermore, it is shown that FREQSAT is recursively axiomatizable, but that there cannot exist an axiomatization of finite arity.