The work described here is part of ongoing research on applying general-purpose inductive logic programming, logic representations of wrappers (L-wrappers), and XML technologies (including the XSLT transformation language) to information extraction from the Web. The L-wrappers methodology rests on a sound theoretical basis and has already proved effective on a smaller scale, in collecting product information. This paper proposes using L-wrappers for tuple extraction from HTML in the e-tourism domain. It also outlines a method for translating L-wrappers into XSLT and illustrates it with an example from a real-world travel agency Web site.