Frequent pattern mining from high-dimensional data using record space search

  • Authors:
  • Kouichirou Mori;Ryohei Orihara

  • Affiliations:
  • Toshiba Corporation, Saiwai-ku, Kawasaki, Kanagawa, Japan;Toshiba Corporation, Saiwai-ku, Kawasaki, Kanagawa, Japan

  • Venue:
  • AIA '08 Proceedings of the 26th IASTED International Conference on Artificial Intelligence and Applications
  • Year:
  • 2008

Abstract

Traditional frequent pattern mining methods search over combinations of attributes, so their computational cost grows exponentially with the number of attributes and becomes prohibitive on high-dimensional data. The purpose of our work is to develop methods that efficiently extract frequent patterns from very high-dimensional data. We propose HD FPM, which addresses this problem using a record space search and minimum pattern length pruning. A record space search enumerates combinations of records rather than combinations of attributes; frequent patterns are extracted from the attributes common to each combination of records. Minimum pattern length pruning further reduces the search space. Experiments on real microarray datasets show that HD FPM outperforms previous closed frequent pattern mining algorithms such as FPclose and CHARM when the minimum support is low. We also propose parallel HD FPM, which combines vertical partitioning of the database with parallel processing. In an evaluation on a real microarray dataset using 16 PCs, parallel HD FPM was 13 times faster than the sequential version. In conclusion, HD FPM and parallel HD FPM are effective algorithms for frequent pattern mining from high-dimensional data.
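
The idea of a record space search with minimum pattern length pruning can be illustrated with a small sketch. This is not the authors' HD FPM implementation, whose details the abstract does not give; it is a naive enumeration written under stated assumptions: records are sets of attribute labels, a pattern is the intersection of a combination of records, and the names `record_space_search`, `min_support`, and `min_length` are hypothetical. Pruning relies only on the fact that adding a record to a combination can never enlarge the set of common attributes.

```python
def record_space_search(records, min_support, min_length):
    """Illustrative sketch (not the published HD FPM algorithm).

    Enumerates combinations of records depth-first and collects the
    attributes common to each combination. Returns a dict mapping each
    pattern (frozenset of attributes) to its support count.
    """
    records = [frozenset(r) for r in records]
    found = {}  # pattern -> support

    def support(pattern):
        # Number of records that contain every attribute of the pattern.
        return sum(1 for r in records if pattern <= r)

    def extend(pattern, start):
        if pattern not in found:
            found[pattern] = support(pattern)
        for i in range(start, len(records)):
            common = pattern & records[i]
            # Minimum pattern length pruning: intersections only shrink
            # as records are added, so this branch cannot recover.
            if len(common) < min_length:
                continue
            # Record i already contains the pattern; adding it changes
            # nothing, and later records are reached by this same loop.
            if common == pattern:
                continue
            extend(common, i + 1)

    for i, record in enumerate(records):
        if len(record) >= min_length:
            extend(record, i + 1)

    return {p: s for p, s in found.items() if s >= min_support}


if __name__ == "__main__":
    # Toy data: four records over five hypothetical attributes g1..g5.
    toy = [
        {"g1", "g2", "g3", "g4"},
        {"g1", "g2", "g3"},
        {"g1", "g2", "g5"},
        {"g4", "g5"},
    ]
    for pattern, count in record_space_search(toy, 2, 2).items():
        print(sorted(pattern), count)
```

On the toy data, with a minimum support of 2 and a minimum pattern length of 2, the sketch yields two patterns: {g1, g2, g3} with support 2 and {g1, g2} with support 3. Because the search branches over records rather than attributes, its cost in this sketch depends on the number of records, which is the motivation for this style of search when attributes vastly outnumber records, as in microarray data.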