Mining frequent trees based on topology projection

  • Authors:
  • Ma Haibing; Wang Chen; Li Ronglu; Liu Yong; Hu Yunfa

  • Affiliations:
  • Computer and Information Technology Department, Fudan University, Shanghai, China (all authors)

  • Venue:
  • APWeb'05: Proceedings of the 7th Asia-Pacific Web Conference on Web Technologies Research and Development
  • Year:
  • 2005

Abstract

Methods for mining frequent trees are widely used in domains such as bioinformatics, web mining, and chemical compound structure mining. In this paper, we present TG, an efficient pattern-growth algorithm for mining frequent embedded subtrees in a forest of rooted, labeled, ordered trees. It uses a rightmost-path expansion scheme to construct the complete pattern-growth space and creates a projected database for every growth point of the pattern to be grown. The problem is thereby transformed from mining frequent trees into finding frequent nodes in the projected databases. We conduct detailed experiments to test its performance and scalability and find that TG outperforms TreeMiner, one of the fastest previously proposed methods, by a factor of 4 to 15.
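
The rightmost-path expansion mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the TG implementation: the abstract gives no details of the data structures or of how projected databases are built, so the preorder `(depth, label)` encoding and the function name `rightmost_extensions` are assumptions made only for illustration. The sketch shows just the candidate-generation step, in which a pattern grows by one node attached as the new last child of some node on its rightmost path.

```python
from typing import List, Tuple

# Assumed encoding (not taken from the paper): an ordered, labeled tree is a
# list of (depth, label) pairs in preorder, with the root at depth 0.
Node = Tuple[int, str]
Tree = List[Node]


def rightmost_extensions(pattern: Tree, labels: List[str]) -> List[Tree]:
    """Return every pattern obtained by adding one node on the rightmost path.

    The last node in preorder is the rightmost leaf. Its ancestors, plus the
    leaf itself, form the rightmost path, which has one node at each depth
    0..last_depth. Attaching a new last child to the node at depth k yields
    a new node at depth k + 1.
    """
    if not pattern:
        # An empty pattern can only grow into a single-node tree.
        return [[(0, label)] for label in labels]

    last_depth = pattern[-1][0]
    extended: List[Tree] = []
    for depth in range(1, last_depth + 2):  # new node depths 1..last_depth+1
        for label in labels:
            extended.append(pattern + [(depth, label)])
    return extended


if __name__ == "__main__":
    # Pattern A -> B -> C (a chain), extended with labels X and Y.
    chain: Tree = [(0, "A"), (1, "B"), (2, "C")]
    for candidate in rightmost_extensions(chain, ["X", "Y"]):
        print(candidate)
```

For the chain pattern in the example, the rightmost path has three nodes, so two labels give six candidate patterns. In a pattern-growth miner, each candidate would then be checked for frequency; according to the abstract, TG does this by looking for frequent nodes in a projected database per growth point, a step this sketch does not attempt to reproduce.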