Mining frequent trees with node-inclusion constraints

Authors:
Atsuyoshi Nakamura;Mineichi Kudo
Affiliations:
Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan;Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
Venue:
PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Year:
2005

Citing 7
Cited 1

Wrapper induction: efficiency and expressiveness

Artificial Intelligence - Special issue on Intelligent internet systems
A flexible learning system for wrapping tables and lists in HTML documents

Proceedings of the 11th international conference on World Wide Web
Mining Sequential Patterns with Regular Expression Constraints

IEEE Transactions on Knowledge and Data Engineering
Mining Sequential Patterns

ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data

PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Efficiently mining frequent trees in a forest

Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining

A statistical interestingness measures for XML based association rules

PRICAI'10 Proceedings of the 11th Pacific Rim international conference on Trends in artificial intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper, we propose an efficient algorithm enumerating all frequent subtrees containing all special nodes that are guaranteed to be included in all trees belonging to a given data. Our algorithm is a modification of TreeMiner algorithm [10] so as to efficiently generate only candidate subtrees satisfying our constraints. We report mining results obtained by applying our algorithm to the problem of finding frequent structures containing the name and reputation of given restaurants in Web pages collected by a search engine.