A new sampling technique for association rule mining

  • Authors:
  • Basel A. Mahafzah; Amer F. Al-Badarneh; Mohammed Z. Zakaria

  • Affiliations:
  • King Abdullah School for Information Technology, University of Jordan, Jordan; School of Computer and Information Technology, Jordan University of Science & Technology, Jordan; School of Computer and Information Technology, Jordan University of Science & Technology, Jordan

  • Venue:
  • Journal of Information Science
  • Year:
  • 2009

Abstract

Association Rule Mining (ARM) is a data mining technique used to extract hidden knowledge from datasets, knowledge that an organization's decision makers can use to improve overall profit. However, performing ARM requires repeated passes over the entire database, and for large databases the input/output overhead of scanning the database is very significant. A popular way to speed up ARM is to apply the mining algorithm to a sample instead of the entire database. In this paper, a parameterized sampling algorithm for ARM is presented. This algorithm extracts sample datasets based on three parameters: transaction frequency, transaction length and transaction frequency-length. To evaluate its performance and accuracy, it is compared against a two-phase sampling-based algorithm using real and synthetic datasets. The experimental results show that the proposed sampling algorithm outperforms the two-phase sampling algorithm in some cases, and achieves up to 98% accuracy.
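The abstract does not define the three parameters precisely, so the following is only a minimal sketch of what parameterized sampling might look like, not the authors' algorithm. It assumes that "transaction frequency" means how many times an identical transaction occurs in the database, that "transaction length" means the number of items in a transaction, and that "frequency-length" is the product of the two. The function names `transaction_weights` and `weighted_sample`, and the choice of weighted sampling without replacement, are hypothetical and chosen for illustration.

```python
import random
from collections import Counter


def transaction_weights(transactions, mode):
    """Return one sampling weight per transaction.

    The three modes mirror the parameters named in the abstract, but their
    exact definitions here are assumptions, not taken from the paper.
    """
    keys = [frozenset(t) for t in transactions]
    freq = Counter(keys)  # occurrences of each identical transaction
    if mode == "frequency":
        return [freq[k] for k in keys]
    if mode == "length":
        return [len(k) for k in keys]
    if mode == "frequency-length":
        return [freq[k] * len(k) for k in keys]
    raise ValueError(f"unknown mode: {mode}")


def weighted_sample(transactions, size, mode, seed=0):
    """Draw up to `size` transactions without replacement, biased by weight.

    Uses the Efraimidis-Spirakis keying trick: each transaction gets the key
    u ** (1 / w) with u uniform in [0, 1), and the largest keys are kept.
    This sampling scheme is a stand-in chosen for the sketch.
    """
    rng = random.Random(seed)
    weights = transaction_weights(transactions, mode)
    keyed = [
        (rng.random() ** (1.0 / w), i)
        for i, w in enumerate(weights)
        if w > 0  # zero-weight (empty) transactions are never selected
    ]
    keyed.sort(reverse=True)
    return [transactions[i] for _, i in keyed[:size]]


# Toy database: each transaction is a list of item identifiers.
db = [["a", "b"], ["a", "b"], ["c"], ["a", "b", "c"]]
sample = weighted_sample(db, size=2, mode="frequency-length")
```

A mining algorithm such as Apriori would then be run on `sample` instead of `db`, trading some accuracy for fewer passes over the full database.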