Answering the Most Correlated N Association Rules Efficiently

  • Authors:
  • Jun Sese;Shinichi Morishita

  • Affiliations:
  • -;-

  • Venue:
  • PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

Many algorithms have been proposed for computing association rules using the support-confidence framework. One drawback of this framework is its weakness in expressing the notion of correlation. We propose an efficient algorithm for mining association rules that uses statistical metrics to determine correlation. The simple application of conventional techniques developed for the support-confidence framework is not possible, since functions for correlation do not meet the anti-monotonicity property that is crucial to traditional methods. In this paper, we propose the heuristics for the vertical decomposition of a database, for pruning unproductive itemsets, and for traversing a set-enumeration tree of itemsets that is tailored to the calculation of the N most significant association rules, where N can be specified by the user. We experimentally compared the combination of these three techniques with the previous statistical approach. Our tests confirmed that the comutational performance improves by several orders of magnitude.