Efficiently Mining Frequent Trees in a Forest: Algorithms and Applications

  • Authors:
  • Mohammed J. Zaki

  • Affiliations:
  • IEEE

  • Venue:
  • IEEE Transactions on Knowledge and Data Engineering
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Mining frequent trees is very useful in domains like bioinformatics, Web mining, mining semistructured data, etc. We formulate the problem of mining (embedded) subtrees in a forest of rooted, labeled, and ordered trees. We present TreeMiner, a novel algorithm to discover all frequent subtrees in a forest, using a new data structure called scope-list. We contrast TreeMiner with a pattern matching tree mining algorithm (PatternMatcher), and we also compare it with TreeMinerD, which counts only distinct occurrences of a pattern. We conduct detailed experiments to test the performance and scalability of these methods. We also use tree mining to analyze RNA structure and phylogenetics data sets from bioinformatics domain.