Recently it was shown that existing general-purpose inductive logic programming systems are useful for learning wrappers (known as L-wrappers) that extract data from HTML documents. Here we propose a formalization of L-wrappers and their patterns, covering their syntax, their semantics, and related properties and operations. A mapping of the patterns to a subset of XSLT that has a formal semantics is outlined and demonstrated by an example. The mapping shows how the theory can be applied to obtain efficient wrappers for information extraction from HTML.