Mining Unordered Distance-Constrained Embedded Subtrees

  • Authors:
  • Fedja Hadzic;Henry Tan;Tharam Dillon

  • Affiliations:
  • DEBII, Curtin University of Technology, Perth, Australia;DEBII, Curtin University of Technology, Perth, Australia;DEBII, Curtin University of Technology, Perth, Australia

  • Venue:
  • DS '08 Proceedings of the 11th International Conference on Discovery Science
  • Year:
  • 2008

Abstract

Frequent subtree mining is an important problem in association rule mining from semi-structured or tree-structured documents, which are common in commercial, web and scientific domains. This paper presents the u3Razor algorithm for mining unordered embedded subtrees in which the distance of each node from the root of the subtree must be taken into account. Mining distance-constrained unordered embedded subtrees is expected to have important applications in web information systems, conceptual model analysis and more sophisticated knowledge matching. An encoding strategy is presented that efficiently enumerates candidate unordered embedded subtrees while taking the distance of each node from the subtree root into account. Both synthetic and real-world datasets are used for experimental evaluation and discussion.
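
The abstract does not describe the encoding itself, so the following is only an illustrative sketch of the underlying idea, not the u3Razor encoding. It assumes a subtree is given as nested `(label, distance, children)` tuples, where `distance` is the number of edges between the node and the subtree root in the source tree. The function name `encode` and the `label@distance` notation are hypothetical. Sorting the encoded children makes the result independent of sibling order (the unordered case), and including the distance separates subtrees that an ordinary embedded-subtree definition would treat as identical.

```python
def encode(node, with_distance=True):
    """Return an order-independent string for a subtree.

    node is a (label, distance, children) tuple; distance is the number
    of edges to the subtree root in the source tree (an assumed layout).
    """
    label, dist, children = node
    # Sorting the child encodings removes any dependence on sibling order.
    parts = sorted(encode(child, with_distance) for child in children)
    tag = f"{label}@{dist}" if with_distance else label
    return tag + "(" + ",".join(parts) + ")"


# Two embedded subtrees over the labels a, b, d.
# In s1, d was a grandchild of a in the source tree; in s2, a direct child.
s1 = ("a", 0, [("b", 1, []), ("d", 2, [])])
s2 = ("a", 0, [("d", 1, []), ("b", 1, [])])

same_without_distance = encode(s1, False) == encode(s2, False)
same_with_distance = encode(s1) == encode(s2)
```

Under these assumptions, `s1` and `s2` count as the same pattern when distances are ignored and as different patterns once the distance to the root is part of the encoding.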