Unordered Tree Mining with Applications to Phylogeny

Authors:
Dennis Shasha;Jason T. L. Wang;Sen Zhang
Affiliations:
-;-;-
Venue:
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Year:
2004


Abstract

Frequent structure mining (FSM) aims to discover and extract patterns frequently occurring in structural data, such as trees and graphs. FSM finds many applications in bioinformatics, XML processing, Web log analysis, and so on. In this paper we present a new FSM technique for finding patterns in rooted unordered labeled trees. The patterns of interest are cousin pairs in these trees. A cousin pair is a pair of nodes sharing the same parent, the same grandparent, or the same great-grandparent, etc. Given a tree T, our algorithm finds all interesting cousin pairs of T in O(|T|^2) time, where |T| is the number of nodes in T. Experimental results on synthetic data and phylogenies show the scalability and effectiveness of the proposed technique. To demonstrate the usefulness of our approach, we discuss its applications to locating co-occurring patterns in multiple evolutionary trees, evaluating the consensus of equally parsimonious trees, and finding kernel trees of groups of phylogenies. We also describe extensions of our algorithms for undirected acyclic graphs (or free trees).
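
To make the cousin-pair definition concrete, the following is a minimal Python sketch and not the authors' algorithm. It reads the definition literally: two nodes form a cousin pair at generation k when their nearest common ancestor lies exactly k levels above both of them (k = 1 for the same parent, k = 2 for the same grandparent, and so on). The function name `cousin_pairs`, the `parent` and `label` dictionaries, and the `max_generations` cutoff (a stand-in for whatever makes a pair "interesting") are all assumptions introduced for illustration. The sketch compares nodes pairwise and walks up the tree, so it is a naive approach and makes no claim to match the O(|T|^2) bound stated in the abstract.

```python
from collections import defaultdict
from itertools import combinations


def cousin_pairs(parent, label, max_generations=3):
    """Count cousin pairs in a rooted unordered labeled tree.

    parent: dict mapping each node to its parent (None for the root).
    label:  dict mapping each node to its label (labels must be sortable).
    Returns {(label_a, label_b, k): count}, where k is the number of levels
    from both nodes up to their nearest common ancestor.
    All names and the cutoff are hypothetical, not taken from the paper.
    """
    def depth_of(v):
        # Walk parent pointers up to the root.
        d = 0
        while parent[v] is not None:
            v = parent[v]
            d += 1
        return d

    # Only nodes at the same depth can share an ancestor at equal distance.
    by_depth = defaultdict(list)
    for v in parent:
        by_depth[depth_of(v)].append(v)

    counts = defaultdict(int)
    for nodes in by_depth.values():
        for u, v in combinations(nodes, 2):
            a, b, k = u, v, 0
            # Climb from both nodes in lockstep until they meet or the
            # cutoff is reached; equal depth guarantees they meet by the root.
            while a != b and k < max_generations:
                a, b, k = parent[a], parent[b], k + 1
            if a == b:
                # Sort labels so the pair is unordered.
                key = tuple(sorted((label[u], label[v]))) + (k,)
                counts[key] += 1
    return dict(counts)


# Small example tree: r has children x and y; x has a and b; y has c.
example_parent = {"r": None, "x": "r", "y": "r", "a": "x", "b": "x", "c": "y"}
example_label = {"r": "R", "x": "X", "y": "Y", "a": "A", "b": "B", "c": "A"}
example_result = cousin_pairs(example_parent, example_label)
```

In the example tree, `a` and `b` share a parent, while `c` shares only a grandparent with each of them, so the result distinguishes sibling pairs (k = 1) from pairs one generation further apart (k = 2). Sorting the labels within each pair reflects the fact that the trees are unordered.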