Extracting Share Frequent Itemsets with Infrequent Subsets

  • Authors: Brock Barber; Howard J. Hamilton

  • Affiliations: Department of Computer Science, University of Regina, 3737 Wascana Parkway, Regina, SK S4S 0A2, Canada (both authors). hamilton@cs.uregina.ca

  • Venue: Data Mining and Knowledge Discovery
  • Year: 2003

Abstract

Itemset share has been proposed as an additional measure of the importance of itemsets in association rule mining (Carter et al., 1997). We compare the share and support measures to illustrate that the share measure can provide useful information about the numerical values typically associated with transaction items, which the support measure cannot. We define the problem of finding share frequent itemsets and show that share frequency does not have the property of downward closure when it is defined in terms of the itemset as a whole. We present algorithms that do not rely on downward closure and are thus able to find share frequent itemsets that have infrequent subsets. The algorithms use heuristic methods to generate candidate itemsets. They supplement the information contained in the set of frequent itemsets from a previous pass with other information that is available at no additional processing cost. They count only those generated itemsets that are predicted to be frequent. The algorithms are applied to a large commercial database, and their effectiveness is examined using principles of classifier evaluation from machine learning.
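
The contrast between support and share, and the failure of downward closure, can be illustrated with a small sketch. The abstract does not define the measures, so this assumes a common formulation: the share of an itemset is the sum of the quantities of its items over the transactions that contain the whole itemset, divided by the total quantity of all items in the database. The toy database, the function names `support` and `share`, and the threshold `MIN_SHARE` are all hypothetical and are not taken from the paper.

```python
from fractions import Fraction

# Hypothetical toy database: each transaction maps an item to its quantity.
transactions = [
    {"A": 1, "B": 10},
    {"A": 1, "B": 10},
    {"C": 3},
]

def support(itemset, db):
    """Fraction of transactions that contain every item in the itemset."""
    containing = sum(1 for t in db if itemset.issubset(t))
    return Fraction(containing, len(db))

def share(itemset, db):
    """Assumed definition: quantity contributed by the itemset's items in
    transactions containing the whole itemset, over the total quantity."""
    total = sum(sum(t.values()) for t in db)
    local = sum(sum(t[i] for i in itemset) for t in db if itemset.issubset(t))
    return Fraction(local, total)

MIN_SHARE = Fraction(1, 2)  # hypothetical minimum share threshold

a = frozenset({"A"})
ab = frozenset({"A", "B"})

print(support(a, transactions), support(ab, transactions))  # 2/3 2/3
print(share(a, transactions), share(ab, transactions))      # 2/25 22/25
print(share(ab, transactions) >= MIN_SHARE)                 # True
print(share(a, transactions) >= MIN_SHARE)                  # False
```

Under these assumptions, {A, B} meets the threshold while its subset {A} does not, even though both have the same support. A candidate generator that prunes every superset of an infrequent itemset, as Apriori-style methods do for support, would therefore miss {A, B}.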