Improving parallelism in structural data mining

  • Authors:
  • Min Cai; Istvan Jonyer; Marcin Paprzycki

  • Affiliations:
  • Department of Computer Science, Oklahoma State University, Stillwater, Oklahoma; Department of Computer Science, Oklahoma State University, Stillwater, Oklahoma; Computer Science Institute, SWPS, Warsaw, Poland

  • Venue:
  • PPAM'05 Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics
  • Year:
  • 2005

Abstract

The large amounts of data collected daily require efficient algorithms to process them. The SUBDUE data mining system discovers substructures in structurally complex data, based on the minimum description length principle. Its parallel implementation, MPI-SUBDUE, was created in 2001 to reduce computation time, to handle larger datasets, or both. This paper introduces a new, more efficient implementation of MPI-SUBDUE. Experimental results show that, for the mutagenesis dataset, the new implementation outperforms the original by up to 33%, and that the performance gain increases with the number of processors used.