Efficiently Mining Frequent Embedded Unordered Trees

Authors:
Mohammed J. Zaki
Affiliations:
(Corresponding author) Computer Science Department, Rensselaer Polytechnic Institute, Troy NY 12180, USA. zaki@cs.rpi.edu
Venue:
Fundamenta Informaticae - Advances in Mining Graphs, Trees and Sequences
Year:
2005

Citing 20

CLIP: concept learning from inference patterns
Artificial Intelligence - Special issue: AI research in Japan

Ordered and Unordered Tree Inclusion
SIAM Journal on Computing

Fast discovery of association rules
Advances in knowledge discovery and data mining

Discovering typical structures of documents: a road map approach
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval

Tree pattern matching and subset matching in deterministic O(n log^3 n)-time
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms

Faster Subtree Isomorphism
Journal of Algorithms

Molecular feature mining in HIV data
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining

Mining Sequential Patterns
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering

Frequent Subgraph Discovery
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining

An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data
PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery

Efficiently mining frequent trees in a forest
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining

TreeFinder: a First Step towards XML Data Mining
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining

gSpan: Graph-Based Substructure Pattern Mining
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining

Indexing and Mining Free Trees
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining

Efficient Mining of Frequent Subgraphs in the Presence of Isomorphism
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining

Efficient Data Mining for Maximal Frequent Subtrees
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining

CloseGraph: mining closed frequent graph patterns
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining

XRules: an effective structural classifier for XML data
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining

HybridTreeMiner: An Efficient Algorithm for Mining Frequent Rooted Trees and Free Trees Using Canonical Forms
SSDBM '04 Proceedings of the 16th International Conference on Scientific and Statistical Database Management

Substructure discovery using minimum description length and background knowledge
Journal of Artificial Intelligence Research

Cited 7

Mining Unordered Distance-Constrained Embedded Subtrees
DS '08 Proceedings of the 11th International Conference on Discovery Science

Information Extraction by XLM
KES '07 Proceedings of the 11th International Conference on Knowledge-Based Intelligent Information and Engineering Systems and the XVII Italian Workshop on Neural Networks

U3 - Mining Unordered Embedded Subtrees Using TMG Candidate Generation
WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01

Tree representation of digital picture embeddings
Journal of Visual Communication and Image Representation

An Experimental Comparison of Different Inclusion Relations in Frequent Tree Mining
Fundamenta Informaticae - Progress on Multi-Relational Data Mining

Performance oriented schema matching
DEXA'07 Proceedings of the 18th international conference on Database and Expert Systems Applications

Mining of closed frequent subtrees from frequently updated databases
Intelligent Data Analysis

Abstract

Mining frequent trees is useful in domains such as bioinformatics, web mining, and the mining of semi-structured data. In this paper we introduce SLEUTH, an efficient algorithm for mining frequent, unordered, embedded subtrees in a database of labeled trees. The key contributions of our work are as follows: (1) we give the first algorithm that enumerates all embedded, unordered trees; (2) we propose a new equivalence class extension scheme to generate all candidate trees; (3) we extend the notion of scope-list joins to compute the frequency of unordered trees; and (4) we conduct a performance evaluation on several synthetic and real datasets to show that SLEUTH is efficient, with performance comparable to TreeMiner, which mines only ordered trees.
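To make the mined object concrete, the following is a minimal brute-force sketch of what it could mean for a pattern to be an embedded, unordered subtree of a data tree, and how its frequency might be counted. It is not SLEUTH: it uses neither equivalence class extension nor scope-list joins. It assumes a common definition of embedding (labels match, and ancestor-descendant relationships are preserved in both directions, with sibling order ignored) and assumes that support is the number of database trees containing at least one embedding. The `Tree` class and the functions `embeds` and `support` are hypothetical names introduced for illustration.

```python
from typing import List, Sequence


class Tree:
    """Rooted labeled tree: labels[i] is node i's label, parent[i] its parent (-1 for the root)."""

    def __init__(self, labels: Sequence[str], parent: Sequence[int]) -> None:
        self.labels = list(labels)
        self.parent = list(parent)

    def is_ancestor(self, a: int, d: int) -> bool:
        """Return True if node a is a proper ancestor of node d."""
        d = self.parent[d]
        while d != -1:
            if d == a:
                return True
            d = self.parent[d]
        return False


def embeds(pattern: Tree, tree: Tree) -> bool:
    """Brute-force test (assumed definition): is there an injective, label-preserving
    mapping of pattern nodes to tree nodes that keeps ancestor relations both ways?"""
    assigned: List[int] = []  # assigned[j] = tree node chosen for pattern node j

    def consistent(i: int, t: int) -> bool:
        if pattern.labels[i] != tree.labels[t] or t in assigned:
            return False
        for j, tj in enumerate(assigned):
            # Ancestorship must agree in both directions; sibling order is never checked.
            if pattern.is_ancestor(j, i) != tree.is_ancestor(tj, t):
                return False
            if pattern.is_ancestor(i, j) != tree.is_ancestor(t, tj):
                return False
        return True

    def extend(i: int) -> bool:
        if i == len(pattern.labels):
            return True
        for t in range(len(tree.labels)):
            if consistent(i, t):
                assigned.append(t)
                if extend(i + 1):
                    return True
                assigned.pop()
        return False

    return extend(0)


def support(pattern: Tree, database: Sequence[Tree]) -> int:
    """Number of database trees that contain the pattern at least once (assumed support notion)."""
    return sum(1 for tree in database if embeds(pattern, tree))


# Data tree t1: A has children B and C; B has child D. Data tree t2: A has child C.
t1 = Tree("ABCD", [-1, 0, 0, 1])
t2 = Tree("AC", [-1, 0])
database = [t1, t2]

print(support(Tree("ACD", [-1, 0, 0]), database))  # 1: D is only a descendant of A, and C precedes D
print(support(Tree("AC", [-1, 0]), database))      # 2
print(support(Tree("ABD", [-1, 0, 0]), database))  # 0: B and D are not siblings-like in t1
```

The exhaustive search is exponential in the pattern size and is meant only to illustrate the inclusion relation on small trees; an efficient miner would avoid re-testing each candidate against every tree from scratch.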