Mining of closed frequent subtrees from frequently updated databases

Authors:
Viet Anh Nguyen; Akihiro Yamamoto
Affiliations:
Graduate School of Informatics, Kyoto University, Kyoto, Japan; Graduate School of Informatics, Kyoto University, Kyoto, Japan
Venue:
Intelligent Data Analysis
Year:
2012


Abstract

We study the problem of mining closed frequent subtrees from tree databases that are updated regularly over time. Closed frequent subtrees provide a condensed yet complete representation of all frequent subtrees in the database. Although mining closed frequent subtrees is in general faster than mining all frequent subtrees, it is still very time-consuming, so mining from scratch is undesirable when the change to the database is small. The set of previously mined closed subtrees should be reused as much as possible to compute newly emerging subtrees. In this paper, we propose a novel and efficient incremental mining algorithm for closed frequent labeled ordered trees. We adopt a divide-and-conquer strategy and apply different mining techniques in different parts of the mining process. The proposed algorithm requires no additional scan of the whole database, and its memory usage remains reasonable. Our experimental study on both synthetic and real-life datasets demonstrates the efficiency and scalability of our algorithm.
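
To make the notion of a closed frequent subtree concrete, the following is a minimal Python sketch, not the incremental algorithm proposed in the paper. It assumes the standard definition (a frequent subtree is closed if no proper supertree has the same support), assumes induced ordered subtree containment, and checks closedness only against a hand-written candidate list. The tree encoding as nested tuples and all function names (`matches_at`, `occurs_in`, `support`, `closed_patterns`) are hypothetical choices made for illustration.

```python
# A labeled ordered tree is encoded as (label, (child, child, ...)).
# This encoding and every name below are illustrative assumptions.

def matches_at(pattern, node):
    """True if `pattern` matches with its root placed on `node`.

    Pattern children must match an in-order subsequence of the node's
    children (induced ordered containment; an assumption of this sketch).
    """
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    i = 0
    for p_child in p_children:
        # Greedily take the leftmost child that matches, preserving order.
        while i < len(n_children) and not matches_at(p_child, n_children[i]):
            i += 1
        if i == len(n_children):
            return False
        i += 1
    return True

def occurs_in(pattern, tree):
    """True if `pattern` matches at the root of `tree` or at any descendant."""
    if matches_at(pattern, tree):
        return True
    return any(occurs_in(pattern, child) for child in tree[1])

def support(pattern, database):
    """Number of database trees that contain `pattern`."""
    return sum(1 for tree in database if occurs_in(pattern, tree))

def closed_patterns(candidates, database, min_support):
    """Frequent candidates with no proper super-pattern of equal support.

    Closedness is checked only relative to `candidates`, which is a
    simplification; a real miner enumerates the pattern space itself.
    """
    frequent = {}
    for p in candidates:
        s = support(p, database)
        if s >= min_support:
            frequent[p] = s
    return [
        p for p, s in frequent.items()
        if not any(q != p and frequent[q] == s and occurs_in(p, q)
                   for q in frequent)
    ]

# Toy database: two copies of A(B, C) and one A(B).
B, C = ("B", ()), ("C", ())
A = ("A", ())
A_B = ("A", (B,))
A_C = ("A", (C,))
A_BC = ("A", (B, C))
database = [A_BC, A_BC, A_B]
candidates = [A, B, C, A_B, A_C, A_BC]

closed = closed_patterns(candidates, database, min_support=2)
```

In this toy database, six candidates are frequent at a minimum support of 2, but only `A(B)` (support 3) and `A(B, C)` (support 2) are closed. Every other frequent candidate is contained in one of these two with the same support, which illustrates why the closed set is a condensed yet complete summary.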