Frequent pattern mining from high-dimensional data using record space search

Authors:
Kouichirou Mori;Ryohei Orihara
Affiliations:
Toshiba Corporation, Saiwai-ku, Kawasaki, Kanagawa, Japan;Toshiba Corporation, Saiwai-ku, Kawasaki, Kanagawa, Japan
Venue:
AIA '08 Proceedings of the 26th IASTED International Conference on Artificial Intelligence and Applications
Year:
2008

Abstract

Traditional frequent pattern mining methods search over combinations of attributes, so their computational cost grows exponentially with the number of attributes and becomes prohibitive on high-dimensional data. The purpose of our work is to develop methods that efficiently extract frequent patterns from very high-dimensional data. We propose HD FPM, which addresses this problem with a record space search and minimum pattern length pruning. A record space search enumerates combinations of records rather than combinations of attributes; frequent patterns are extracted from the attributes common to each combination of records. Minimum pattern length pruning further reduces the search space. Experiments on real microarray datasets show that HD FPM outperforms previous closed frequent pattern mining algorithms such as FPclose and CHARM when the minimum support is low. We also propose parallel HD FPM, which addresses the same problem using vertical partitioning of the database and parallel processing. In an evaluation on a real microarray dataset using 16 PCs, parallel HD FPM was 13 times faster than the sequential version. In conclusion, HD FPM and parallel HD FPM are effective algorithms for frequent pattern mining from high-dimensional data.
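
The idea of a record space search can be illustrated with a minimal Python sketch. It is not the authors' HD FPM implementation; the abstract gives no algorithmic detail, so the function name `record_space_search`, the depth-first enumeration order, and the exact form of the pruning rule are assumptions made for illustration. The sketch enumerates combinations of records, intersects their attribute sets, and abandons a branch as soon as the shared attributes fall below a minimum pattern length. That is safe because adding more records can only shrink the intersection.

```python
from typing import Dict, FrozenSet, List, Optional


def record_space_search(
    records: List[FrozenSet[str]],
    min_support: int,
    min_length: int,
) -> Dict[FrozenSet[str], int]:
    """Illustrative record-space enumeration (assumed design, not HD FPM itself).

    Returns a mapping from each pattern found to its support, i.e. the
    number of records that contain every attribute of the pattern.
    """
    found: Dict[FrozenSet[str], int] = {}

    def extend(start: int, common: Optional[FrozenSet[str]], size: int) -> None:
        # `common` holds the attributes shared by all records chosen so far.
        if common is not None and size >= min_support:
            # Count every record containing the pattern, not only the chosen ones.
            found[common] = sum(1 for r in records if common <= r)
        for i in range(start, len(records)):
            shared = records[i] if common is None else common & records[i]
            # Minimum pattern length pruning: intersections only shrink as
            # records are added, so a short pattern can never grow back.
            if len(shared) < min_length:
                continue
            extend(i + 1, shared, size + 1)

    extend(0, None, 0)
    return found


# Toy data: each record is the set of attributes present in it.
toy_records = [
    frozenset("abcd"),
    frozenset("abc"),
    frozenset("abe"),
    frozenset("cd"),
]
patterns = record_space_search(toy_records, min_support=2, min_length=2)
for pattern, support in sorted(patterns.items(), key=lambda kv: sorted(kv[0])):
    print("".join(sorted(pattern)), support)
```

Because every reported pattern is the intersection of a set of records, the patterns in this sketch are closed. The search depth is bounded by the number of records rather than the number of attributes, which is why this direction suits data such as microarrays, where attributes far outnumber records.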