EXiT-B: a new approach for extracting maximal frequent subtrees from XML data

  • Authors: Juryon Paik; Dongho Won; Farshad Fotouhi; Ung Mo Kim

  • Affiliations: Department of Computer Engineering, Sungkyunkwan University, Suwon, Gyeonggi-do, Republic of Korea; Department of Computer Engineering, Sungkyunkwan University, Suwon, Gyeonggi-do, Republic of Korea; Wayne State University, Detroit, MI; Department of Computer Engineering, Sungkyunkwan University, Suwon, Gyeonggi-do, Republic of Korea

  • Venue: IDEAL'05: Proceedings of the 6th International Conference on Intelligent Data Engineering and Automated Learning
  • Year: 2005

Abstract

As the amount of available XML data grows, the data mining community has been motivated to discover useful information in collections of XML documents. One of the most popular approaches is to extract frequent subtrees from a set of XML trees. In this paper, we propose a novel algorithm, EXiT-B, for efficiently extracting maximal frequent subtrees from a set of XML documents. The main contribution of our algorithm is that no tree join operations are needed during the phase that generates maximal frequent subtrees. Thus, the task of finding maximal frequent subtrees is significantly simplified compared with previous approaches.
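
To make the terms "frequent" and "maximal" concrete, the following minimal sketch works on a deliberately simplified stand-in for subtrees: root-to-node label paths. It is not the EXiT-B algorithm, whose details the abstract does not give. The function names `label_paths` and `maximal_frequent_paths`, the absolute `min_support` threshold, and the sample documents are all assumptions made for illustration. A path counts as frequent if it occurs in at least `min_support` documents, and as maximal if no other frequent path extends it.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def label_paths(xml_text):
    """Return the set of root-to-node label paths in one XML document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + (child.tag,)))
    return paths


def maximal_frequent_paths(documents, min_support):
    """Return frequent label paths that no other frequent path extends.

    min_support is a hypothetical absolute threshold: the minimum number
    of documents in which a path must occur to be called frequent.
    """
    counts = Counter()
    for doc in documents:
        # label_paths returns a set, so each path counts once per document.
        counts.update(label_paths(doc))
    frequent = {p for p, c in counts.items() if c >= min_support}
    # A path is maximal if it is not a proper prefix of another frequent path.
    maximal = {
        p for p in frequent
        if not any(len(q) > len(p) and q[:len(p)] == p for q in frequent)
    }
    return sorted(maximal)


# Illustrative documents (made up for this sketch).
docs = [
    "<book><title/><author><name/></author></book>",
    "<book><title/><author><name/><email/></author></book>",
    "<book><title/><price/></book>",
]
result = maximal_frequent_paths(docs, min_support=2)
print(result)
```

Here `book/author/email` and `book/price` each occur in only one document and are dropped, while `book` and `book/author` are frequent but not maximal because longer frequent paths extend them. Real subtree mining must also handle branching patterns, which this path-based simplification ignores.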