Static load balancing of parallel mining of frequent itemsets using reservoir sampling

  • Authors: Robert Kessl

  • Affiliations: Czech Academy of Science, Institute of Computer Science, Prague

  • Venue: MLDM'11 Proceedings of the 7th International Conference on Machine Learning and Data Mining in Pattern Recognition

  • Year: 2011

Abstract

In this paper, we present a novel method for parallelizing an arbitrary depth-first search (DFS) algorithm for mining all frequent itemsets (FIs). The method is based on the so-called reservoir sampling algorithm. Combined with an arbitrary DFS mining algorithm executed on a database sample, reservoir sampling takes a uniformly, but not independently, distributed sample of all FIs. The sample is then used for static load balancing of the computational load of a DFS algorithm for mining all FIs.
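
The abstract gives no implementation details, so the following Python sketch is only an illustration of the general idea, not the paper's method. It assumes the classic single-pass reservoir sampling scheme (often called Algorithm R), a deliberately naive toy DFS miner, and a greedy largest-first assignment of prefix classes to workers. The function names (`reservoir_sample`, `dfs_frequent_itemsets`, `balance`), the choice to partition by first item, and the greedy heuristic are all assumptions made for this example. For brevity the toy miner runs on the whole toy database rather than on a database sample.

```python
import random
from collections import Counter


def reservoir_sample(stream, k, rng):
    """Algorithm R: uniform sample of k items from a stream of unknown length."""
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the n-th item with probability k/n, replacing a random slot.
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir


def dfs_frequent_itemsets(transactions, min_support):
    """Toy DFS miner (hypothetical): yields frequent itemsets as sorted tuples."""
    items = sorted({i for t in transactions for i in t})

    def support(itemset):
        s = set(itemset)
        return sum(1 for t in transactions if s <= t)

    def extend(prefix, candidates):
        for idx, item in enumerate(candidates):
            itemset = prefix + (item,)
            if support(itemset) >= min_support:
                yield itemset
                # Depth-first: extend only with items that come later.
                yield from extend(itemset, candidates[idx + 1:])

    yield from extend((), items)


def balance(sample, num_workers):
    """Assumed heuristic: assign prefix classes (keyed by first item) to the
    currently least-loaded worker, largest estimated class first."""
    sizes = Counter(itemset[0] for itemset in sample)
    loads = [0] * num_workers
    assignment = {}
    for item, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        w = loads.index(min(loads))
        assignment[item] = w
        loads[w] += size
    return assignment, loads


transactions = [
    {"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
]
stream = dfs_frequent_itemsets(transactions, min_support=2)
sample = reservoir_sample(stream, k=4, rng=random.Random(0))
assignment, loads = balance(sample, num_workers=2)
```

In this sketch, the number of sampled itemsets falling into each prefix class serves as an estimate of that class's relative size, and the assignment is fixed before any parallel mining starts, which is what makes the balancing static. The sample is uniform over the itemsets the miner emits, but the draws are not independent, since every itemset can appear in the reservoir at most once.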