This paper surveys our recent results on knowledge discovery from semistructured texts, which contain heterogeneous structures represented by labeled trees. The aim of our study is to extract useful information from documents on the Web. First, we present theoretical results on learning rewriting rules between labeled trees. Second, we apply our method to learning HTML trees in the framework of wrapper induction. We also evaluate our algorithms on real-world HTML documents and present the results.