Mop: An Efficient Algorithm for Mining Frequent Pattern with Subtree Traversing

  • Authors:
  • Zhi-Hong Deng;Ning Gao;Xiao-Ran Xu

  • Affiliations:
  • (Correspd.) Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China. zhdeng@cis.pku.edu.cn;School of Electronics Engineering and Computer Science, Peking University, Beijing, China;School of Electronics Engineering and Computer Science, Peking University, Beijing, China

  • Venue:
  • Fundamenta Informaticae
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Mining frequent patterns in database has emerged as an important task in knowledge discovery and data mining. In this paper, we present an efficient algorithm called Mop for fast frequent pattern discovery. Mop utilizes a new kind of data structure called OP_tree (ordered pattern tree) and some particular properties of frequent patterns to facilitate the process of mining frequent patterns. An OP_tree is a special frequent pattern tree, where the children of any node are sorted according to the supports of corresponding items. Efficiency of Mop is achieved with three techniques: (1) it adopts OP_tree to store a large database to avoid repetitive database scans, (2) it finds all frequent 2-patterns in the construction of OP_tree to avoid the costly generation of a large number of candidate 2-patterns, (3) the supports of candidate k-patterns (k2) can be obtained by traversing a few of specific subtrees of the OP_tree, which greatly reduces the search space and avoid multi-scans of a database. We experimentally compare our algorithm with the Apriori algorithm and the FP-growth algorithm on one real database and one synthetical database. The experimental results show that Mop is about an order of magnitude faster than the Apriori algorithm. Mop also outperforms the FP-growth algorithm, especially when support threshold is very low and databases are quite large.