Generating finite-state transducers for semi-structured data extraction from the Web
Information Systems - Special issue on semistructured data
Recognizing structure in Web pages using similarity queries
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence
Learning dictionaries for information extraction by multi-level bootstrapping
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference innovative applications of artificial intelligence
Extracting semi-structured data through examples
Proceedings of the eighth international conference on Information and knowledge management
Conceptual-model-based data extraction from multiple-record Web pages
Data & Knowledge Engineering
Wrapper induction: efficiency and expressiveness
Artificial Intelligence - Special issue on Intelligent internet systems
A Fully Automated Object Extraction System for the World Wide Web
ICDCS '01 Proceedings of the The 21st International Conference on Distributed Computing Systems
A brief survey of web data extraction tools
ACM SIGMOD Record
Collecting hidden weeb pages for data extraction
Proceedings of the 4th international workshop on Web information and data management
The Debye Environment for Web Data Management
IEEE Internet Computing
A Framework for Generating Attribute Extractors for Web Data Sources
SPIRE 2002 Proceedings of the 9th International Symposium on String Processing and Information Retrieval
The Web-DL environment for building digital libraries from the Web
Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries
Automatic generation of agents for collecting hidden web pages for data extraction
Data & Knowledge Engineering - Special issue: WIDM 2002
Adapting Web information extraction knowledge via mining site-invariant and site-dependent features
ACM Transactions on Internet Technology (TOIT)
Bootstrapping Information Extraction from Semi-structured Web Pages
ECML PKDD '08 Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I
Cross Language Information Extraction Knowledge Adaptation
RSKT '09 Proceedings of the 4th International Conference on Rough Sets and Knowledge Technology
Normalizing web product attributes and discovering domain ontology with minimal effort
Proceedings of the fourth ACM international conference on Web search and data mining
Learning to adapt cross language information extraction wrapper
Applied Intelligence
Hi-index | 0.00 |
The effortless generation of wrappers for Web data sources is a crucial task if proper access to the huge amount of semi-structured data on the Web is to be granted. In particular, the development of strategies for wrapper generation based on user-given examples is currently one of the most promising research directions in Web data extraction. In this paper we show how to use a pre-existing data repository to automatically generate examples and allow full automated example-based data extraction. To demonstrate the feasibility of our approach we provide a number of results obtained from experiments we carried out and discuss how our ideas can be used to improve extraction rates and for providing resilience and adaptiveness for example-based generated wrappers.