We study the novel problem of mining subtrees for which similar subtrees occur frequently, and propose an algorithm for it. In our problem setting, the frequency of a subtree counts occurrences not only of equivalent subtrees but also of similar ones. In our experiments on tag trees of web pages, the problem could be solved fast enough for practical use. A preliminary experiment on data record extraction from web pages using our mining method gave an encouraging result.
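
To make the counting rule concrete, the following is a minimal sketch of how occurrences might be counted when similar subtrees are included. It is not the algorithm proposed here: the similarity measure (a top-down, Selkow-style edit distance), the threshold `k`, the nested-tuple tree encoding, and the names `dist`, `subtrees`, and `similar_frequency` are all assumptions made for illustration.

```python
from functools import lru_cache

# Assumed encoding: a tree is a nested tuple (label, child, child, ...).


def size(t):
    """Number of nodes in tree t."""
    return 1 + sum(size(c) for c in t[1:])


@lru_cache(maxsize=None)
def dist(a, b):
    """Top-down edit distance (assumed similarity measure).

    Allowed operations: relabel a node (cost 1) and insert or delete
    a whole child subtree (cost = number of nodes in that subtree).
    """
    ca, cb = a[1:], b[1:]
    # d[i][j] = cost of turning the first i children of a
    # into the first j children of b.
    d = [[0] * (len(cb) + 1) for _ in range(len(ca) + 1)]
    for i in range(1, len(ca) + 1):
        d[i][0] = d[i - 1][0] + size(ca[i - 1])
    for j in range(1, len(cb) + 1):
        d[0][j] = d[0][j - 1] + size(cb[j - 1])
    for i in range(1, len(ca) + 1):
        for j in range(1, len(cb) + 1):
            d[i][j] = min(
                d[i - 1][j] + size(ca[i - 1]),                # delete subtree
                d[i][j - 1] + size(cb[j - 1]),                # insert subtree
                d[i - 1][j - 1] + dist(ca[i - 1], cb[j - 1]), # match children
            )
    return int(a[0] != b[0]) + d[-1][-1]


def subtrees(t):
    """Yield every subtree of t rooted at a node of t (t itself included)."""
    yield t
    for c in t[1:]:
        yield from subtrees(c)


def similar_frequency(pattern, tree, k):
    """Count subtrees of `tree` within distance k of `pattern`."""
    return sum(1 for s in subtrees(tree) if dist(pattern, s) <= k)


# Hypothetical tag tree of a small web page fragment.
page = ("ul",
        ("li", ("a",), ("b",)),
        ("li", ("a",)),
        ("li", ("a",), ("b",)),
        ("p",))
pattern = ("li", ("a",), ("b",))

exact = similar_frequency(pattern, page, 0)  # equivalent subtrees only
loose = similar_frequency(pattern, page, 1)  # also counts near matches
```

With `k = 0` only the two identical `li` records are counted; with `k = 1` the `li` record that lacks the `b` child is counted as well, which is the effect of counting similar subtrees alongside equivalent ones. This brute-force count compares the pattern against every subtree and is meant only to illustrate the frequency definition, not to be fast.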