Tree-based partitioning of date for association rule mining

  • Authors:
  • Shakil Ahmed; Frans Coenen; Paul Leng

  • Affiliations:
  • Department of Computer Science, The University of Liverpool, L69 3BX, Liverpool, UK (all three authors)

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2006

Abstract

The most computationally demanding aspect of Association Rule Mining is the identification and counting of support of the frequent sets of items that occur together sufficiently often to be the basis of potentially interesting rules. The task increases in difficulty with the scale of the data and also with its density. The greatest challenge is posed by data that is too large to be contained in primary memory, especially when high data density and/or low support thresholds give rise to very large numbers of candidates that must be counted. In this paper, we consider strategies for partitioning the data to deal effectively with such cases. We describe a partitioning approach which organises the data into tree structures that can be processed independently. We present experimental results that show the method scales well for increasing dimensions of data and performs significantly better than alternatives, especially when dealing with dense data and low support thresholds.
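
To make the support-counting task concrete, the following is a minimal sketch of counting itemset support over data that has been split into independently processed partitions. It is not the tree-based method of the paper, which the abstract does not specify in detail; it assumes a simple horizontal split of the transactions and brute-force enumeration of itemsets. The names count_support, frequent_itemsets, max_size and min_support are hypothetical and chosen for illustration only.

```python
from collections import Counter
from itertools import combinations


def count_support(transactions, max_size):
    """Count how many transactions contain each itemset of up to max_size items.

    Brute-force enumeration, used here only for illustration; it is not
    the tree structure described in the paper.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, duplicates removed
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def frequent_itemsets(partitions, min_support, max_size=2):
    """Count each partition on its own, sum the counts, then apply the threshold.

    min_support is an absolute number of transactions (an assumption of
    this sketch), not a fraction of the data set.
    """
    total = Counter()
    for partition in partitions:
        # Each partition is processed independently, so only one of them
        # needs to be held in primary memory at a time.
        total.update(count_support(partition, max_size))
    return {itemset: n for itemset, n in total.items() if n >= min_support}


# Five toy transactions split into two partitions.
partitions = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "c"], ["a", "b", "c"]],
]
result = frequent_itemsets(partitions, min_support=3)
```

In this toy data every single item occurs in four transactions and every pair in three, so lowering min_support from 4 to 3 doubles the number of frequent sets from three to six. This is a small-scale view of why dense data and low support thresholds inflate the number of candidates that must be counted.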