Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, and computer networks. An important problem in mining databases of trees is finding frequently occurring subtrees. Because of combinatorial explosion, the number of frequent subtrees usually grows exponentially with subtree size, so mining all frequent subtrees becomes infeasible when the trees are large. In this paper, we present CMTreeMiner, a computationally efficient algorithm that discovers only the closed and maximal frequent subtrees in a database of labeled rooted trees, where the rooted trees can be either ordered or unordered. The algorithm mines both closed and maximal frequent subtrees by traversing an enumeration tree that systematically enumerates all frequent subtrees. Several techniques are proposed to prune the branches of the enumeration tree that do not correspond to closed or maximal frequent subtrees. Heuristics order the computation so that relatively expensive steps are avoided as much as possible. We study the performance of the algorithm through extensive experiments on both synthetic data and data sets from real applications. The experimental results show that the algorithm prunes the search space effectively and quickly discovers all closed and maximal frequent subtrees.
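To make the terms "closed" and "maximal" concrete, the following is a minimal sketch, not CMTreeMiner itself. It assumes the usual definitions, which the abstract does not spell out: a frequent subtree is maximal if no proper supertree of it is frequent, and closed if no proper supertree has the same support. It also assumes induced subtrees of unordered trees and per-tree (transaction) support. All names (`tree`, `is_subtree`, `support`, `closed_and_maximal`) and the toy database are hypothetical, and the candidate list is enumerated by hand rather than mined.

```python
from itertools import permutations


def tree(label, *children):
    """Build a rooted unordered labeled tree as (label, sorted children)."""
    return (label, tuple(sorted(children)))


def matches_at_root(pattern, target):
    """True if pattern occurs in target with the two roots aligned."""
    p_label, p_kids = pattern
    t_label, t_kids = target
    if p_label != t_label or len(p_kids) > len(t_kids):
        return False
    # Brute force: try every injective assignment of pattern children
    # to target children (fine for toy inputs, exponential in general).
    return any(
        all(matches_at_root(p, t) for p, t in zip(p_kids, chosen))
        for chosen in permutations(t_kids, len(p_kids))
    )


def is_subtree(pattern, target):
    """True if pattern occurs as an induced subtree anywhere in target."""
    return matches_at_root(pattern, target) or any(
        is_subtree(pattern, kid) for kid in target[1]
    )


def support(pattern, database):
    """Number of database trees that contain the pattern."""
    return sum(is_subtree(pattern, t) for t in database)


def closed_and_maximal(frequent):
    """Post-filter a COMPLETE map {frequent subtree: support}."""
    closed, maximal = [], []
    for p, sup in frequent.items():
        supers = [q for q in frequent if q != p and is_subtree(p, q)]
        if not supers:
            maximal.append(p)  # no frequent proper supertree
        if all(frequent[q] < sup for q in supers):
            closed.append(p)   # no proper supertree with equal support
    return closed, maximal


# Toy database (hypothetical): two small trees.
A_BC = tree("A", tree("B"), tree("C"))
A_B = tree("A", tree("B"))
database = [A_BC, A_B]

# Hand-enumerated induced subtrees of the toy database (an assumption;
# a real miner would generate these systematically).
candidates = [tree("A"), tree("B"), tree("C"), A_B, tree("A", tree("C")), A_BC]

min_support = 1
frequent = {}
for p in candidates:
    s = support(p, database)
    if s >= min_support:
        frequent[p] = s

closed, maximal = closed_and_maximal(frequent)
print("frequent:", len(frequent), "closed:", len(closed), "maximal:", len(maximal))
```

On this toy input, six subtrees are frequent but only two are closed and one is maximal, which illustrates why reporting only closed or maximal subtrees gives a much smaller result. The sketch enumerates everything first and filters afterwards; the abstract describes the opposite approach, pruning the enumeration tree so that non-closed and non-maximal branches are not explored.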