Beyond Market Baskets: Generalizing Association Rules to Dependence Rules

  • Authors:
  • Craig Silverstein;Sergey Brin;Rajeev Motwani

  • Affiliations:
  • Department of Computer Science, Stanford University, Stanford, CA 94305.;Department of Computer Science, Stanford University, Stanford, CA 94305.;Department of Computer Science, Stanford University, Stanford, CA 94305.

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 1998

Abstract

One of the more well-studied problems in data mining is the search for association rules in market basket data. Association rules are intended to identify patterns of the type: “A customer purchasing item A often also purchases item B.” Motivated partly by the goal of generalizing beyond market basket data and partly by the goal of ironing out some problems in the definition of association rules, we develop the notion of dependence rules that identify statistical dependence in both the presence and absence of items in itemsets. We propose measuring significance of dependence via the chi-squared test for independence from classical statistics. This leads to a measure that is upward-closed in the itemset lattice, enabling us to reduce the mining problem to the search for a border between dependent and independent itemsets in the lattice. We develop pruning strategies based on the closure property and thereby devise an efficient algorithm for discovering dependence rules. We demonstrate our algorithm’s effectiveness by testing it on census data, text data (wherein we seek term dependence), and synthetic data.
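
As a rough illustration of the chi-squared test for independence mentioned in the abstract, the sketch below computes the statistic for a single pair of items over a list of baskets, counting both presence and absence of each item. It is not the authors' algorithm: the function name `chi_squared_pair`, the constant `CRITICAL_95_1DF`, and the toy baskets are all hypothetical, and the lattice search and pruning described in the abstract are not attempted here.

```python
from itertools import product

# Standard chi-squared critical value for 1 degree of freedom at the 95% level.
CRITICAL_95_1DF = 3.841


def chi_squared_pair(baskets, a, b):
    """Chi-squared statistic for independence of items a and b.

    Builds the 2x2 contingency table over presence/absence of each item
    and compares observed cell counts with the counts expected if the
    two items occurred independently.
    """
    n = len(baskets)
    # observed[(has_a, has_b)] = number of baskets with that combination
    observed = {cell: 0 for cell in product((True, False), repeat=2)}
    for basket in baskets:
        observed[(a in basket, b in basket)] += 1

    # Marginal counts for presence and absence of each item.
    count_a = {True: observed[(True, True)] + observed[(True, False)]}
    count_a[False] = n - count_a[True]
    count_b = {True: observed[(True, True)] + observed[(False, True)]}
    count_b[False] = n - count_b[True]

    chi2 = 0.0
    for (has_a, has_b), obs in observed.items():
        expected = count_a[has_a] * count_b[has_b] / n
        if expected > 0:  # skip empty cells to avoid division by zero
            chi2 += (obs - expected) ** 2 / expected
    return chi2


# Toy data (hypothetical): A and B always appear together or not at all.
dependent = [{"A", "B"}, {"A", "B"}, set(), set()]
# Toy data (hypothetical): every presence/absence combination is equally common.
independent = [{"A", "B"}, {"A"}, {"B"}, set()]

print(chi_squared_pair(dependent, "A", "B") > CRITICAL_95_1DF)    # True
print(chi_squared_pair(independent, "A", "B") > CRITICAL_95_1DF)  # False
```

A statistic above the critical value suggests that the pair is dependent at the chosen significance level. The toy samples are far too small for the chi-squared approximation to be trusted in practice; they are chosen only so that the arithmetic is easy to follow by hand.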