Mining subtrees with frequent occurrence of similar subtrees

  • Authors:
  • Hisashi Tosaka;Atsuyoshi Nakamura;Mineichi Kudo

  • Affiliations:
  • Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan;Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan;Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan

  • Venue:
  • DS'07 Proceedings of the 10th international conference on Discovery science
  • Year:
  • 2007

Abstract

We study the novel problem of mining subtrees with frequent occurrence of similar subtrees, and propose an algorithm for it. In our problem setting, the frequency of a subtree counts occurrences of similar subtrees as well as equivalent ones. In our experiments on tag trees of web pages, the problem was solved fast enough for practical use. A preliminary experiment that applied our mining method to data record extraction from web pages gave an encouraging result.
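
To make the counting rule concrete, the following is a minimal sketch of frequency counting that admits similar subtrees, not the algorithm proposed in the paper. The abstract does not define the similarity measure, so everything here is an assumption made for illustration: trees are nested tuples of the form (label, children), the names distance and similar_frequency are hypothetical, and the toy distance aligns children by position, charges 1 for a label mismatch, and charges the size of each unmatched child subtree.

```python
# Illustrative sketch only: the tree encoding, the distance, and all names
# below are assumptions, not the measure or algorithm used in the paper.

def size(tree):
    """Number of nodes in a (label, children) tree."""
    return 1 + sum(size(child) for child in tree[1])

def subtrees(tree):
    """Yield the complete subtree rooted at every node, in preorder."""
    yield tree
    for child in tree[1]:
        yield from subtrees(child)

def distance(a, b):
    """Toy distance: 1 per label mismatch, children aligned by position,
    plus the size of every child subtree left without a partner."""
    cost = 0 if a[0] == b[0] else 1
    kids_a, kids_b = a[1], b[1]
    for x, y in zip(kids_a, kids_b):
        cost += distance(x, y)
    shared = min(len(kids_a), len(kids_b))
    longer = kids_a if len(kids_a) > len(kids_b) else kids_b
    for extra in longer[shared:]:
        cost += size(extra)
    return cost

def similar_frequency(pattern, tree, max_dist):
    """Count subtrees of `tree` within `max_dist` of `pattern`.
    With max_dist = 0 only equivalent subtrees are counted."""
    return sum(1 for s in subtrees(tree) if distance(pattern, s) <= max_dist)

# A small tag tree resembling a list of records on a web page.
page = ("ul", (
    ("li", (("a", ()), ("b", ()))),
    ("li", (("a", ()), ("i", ()))),
    ("li", (("a", ()),)),
))
pattern = ("li", (("a", ()), ("b", ())))

exact = similar_frequency(pattern, page, 0)
relaxed = similar_frequency(pattern, page, 1)
```

In this toy tag tree the pattern has only one equivalent occurrence, but relaxing the threshold to 1 also counts the two list items that differ by a single tag, which is the kind of repeated, slightly varying structure that data records on a web page tend to show.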