In this paper, we consider the problem of discovering interesting substructures in a large collection of semi-structured data within the framework of optimized pattern discovery. We model semi-structured data and patterns as labeled ordered trees, and present an efficient algorithm that discovers the labeled ordered trees that optimize a given statistical measure, such as information entropy or classification accuracy, over a collection of semi-structured data. We give theoretical analyses of the computational complexity of the algorithm for patterns of bounded and unbounded size. Experiments show that the algorithm performs well and discovers interesting patterns on real datasets.
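To make the objective concrete, the following is a minimal sketch of how one candidate pattern might be scored by information entropy. It is not the efficient algorithm of the paper: it scores a single given pattern by brute force. The tree encoding, the matching semantics (parent-child relation and left-to-right sibling order preserved), the binary labels, and all function names are assumptions made for illustration only.

```python
from math import log2

# Assumed encoding: a labeled ordered tree is a (label, children) pair,
# where children is a list of trees in left-to-right order.

def matches_at(pattern, node):
    """True if pattern matches with its root mapped onto node.

    Assumed semantics: labels must agree, and the pattern's children must
    map, in order, onto distinct children of node (an ordered subsequence).
    """
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    i = 0
    for p_child in p_children:
        # Greedy leftmost match is sufficient for ordered subsequences.
        while i < len(n_children) and not matches_at(p_child, n_children[i]):
            i += 1
        if i == len(n_children):
            return False
        i += 1
    return True

def occurs_in(pattern, tree):
    """True if pattern matches at the root or at any descendant of tree."""
    if matches_at(pattern, tree):
        return True
    return any(occurs_in(pattern, child) for child in tree[1])

def entropy(pos, total):
    """Binary entropy, in bits, of a group with pos positives out of total."""
    if total == 0 or pos == 0 or pos == total:
        return 0.0
    p = pos / total
    return -p * log2(p) - (1 - p) * log2(1 - p)

def split_entropy(pattern, labeled_trees):
    """Weighted entropy of the split induced by pattern (lower is better)."""
    matched = [y for t, y in labeled_trees if occurs_in(pattern, t)]
    unmatched = [y for t, y in labeled_trees if not occurs_in(pattern, t)]
    n = len(labeled_trees)
    return (len(matched) / n) * entropy(sum(matched), len(matched)) \
        + (len(unmatched) / n) * entropy(sum(unmatched), len(unmatched))

# Hypothetical toy collection: (tree, label) pairs with labels 1 or 0.
data = [
    (("a", [("b", []), ("c", [])]), 1),
    (("a", [("c", []), ("b", [])]), 0),
    (("r", [("a", [("b", []), ("d", []), ("c", [])])]), 1),
    (("a", [("b", [])]), 0),
]
good_pattern = ("a", [("b", []), ("c", [])])
weak_pattern = ("a", [])

print(split_entropy(good_pattern, data))
print(split_entropy(weak_pattern, data))
```

On this toy collection, `good_pattern` occurs only in the positively labeled trees, so it separates the two classes, while `weak_pattern` occurs in every tree and gives no separation. An optimized pattern discovery procedure would search the space of patterns for one minimizing such a score. The search strategy is the contribution of the paper and is not reproduced here.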