Information extraction (IE) addresses the problem of extracting specific information from a collection of documents. Much of the previous work for IE from structured documents formatted in HTML or XML uses techniques for IE from strings, such as grammar and automata induction. However, such documents have a tree structure. Hence it is natural to investigate methods that are able to recognise and exploit this tree structure. We do this by exploring the use of tree automata for IE in structured documents. Experimental results on benchmark data sets show that our approach compares favorably with previous approaches.
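
To make the notion of a tree automaton concrete, the following is a minimal sketch of a deterministic bottom-up automaton running over a small HTML-like tree. It is an illustration only and not the induction algorithm evaluated in this work: the `Node` class, the state names, and the hand-written transition table are all hypothetical, and in a learning setting such a table would be inferred from examples rather than written by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of an unranked, ordered tree (hypothetical helper type)."""
    label: str
    children: list["Node"] = field(default_factory=list)


# Hypothetical transition table: (label, tuple of child states) -> state.
# A real automaton for unranked trees would usually describe the allowed
# child-state sequences with a regular language; a finite table is enough
# for this toy example.
TRANSITIONS = {
    ("#text", ()): "q_text",
    ("b", ("q_text",)): "q_name",
    ("i", ("q_text",)): "q_price",
    ("li", ("q_name", "q_price")): "q_item",
    ("ul", ("q_item",)): "q_list",
    ("ul", ("q_item", "q_item")): "q_list",
}
ACCEPTING = {"q_list"}


def run(node):
    """Assign a state to `node` bottom-up; return None if no rule applies."""
    child_states = tuple(run(child) for child in node.children)
    if None in child_states:
        return None
    return TRANSITIONS.get((node.label, child_states))


def accepts(tree):
    """A tree is accepted when its root reaches an accepting state."""
    return run(tree) in ACCEPTING


def item(first_tag, second_tag):
    """Build <li><first_tag>text</first_tag><second_tag>text</second_tag></li>."""
    return Node("li", [
        Node(first_tag, [Node("#text")]),
        Node(second_tag, [Node("#text")]),
    ])


good = Node("ul", [item("b", "i"), item("b", "i")])
bad = Node("ul", [item("i", "b")])  # name and price fields swapped

print(accepts(good))
print(accepts(bad))
```

The state of each node depends on its label and on the states of its children, so the automaton reasons about nesting directly instead of scanning a flattened tag sequence, which is the structural information that string-based techniques do not exploit.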