Bootstrapping for example-based data extraction

  • Authors: Paulo B. Golgher; Altigran S. da Silva; Alberto H. F. Laender; Berthier Ribeiro-Neto

  • Affiliation (all authors): Federal University of Minas Gerais, Belo Horizonte, MG, Brazil

  • Venue: Proceedings of the Tenth International Conference on Information and Knowledge Management
  • Year: 2001

Abstract

The effortless generation of wrappers for Web data sources is crucial if the huge amount of semi-structured data on the Web is to be properly accessed. In particular, developing strategies for wrapper generation based on user-given examples is currently one of the most promising research directions in Web data extraction. In this paper we show how to use a pre-existing data repository to automatically generate examples and thereby allow fully automated example-based data extraction. To demonstrate the feasibility of our approach, we report results from experiments we carried out and discuss how our ideas can be used to improve extraction rates and to provide resilience and adaptiveness for wrappers generated from examples.
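
The abstract does not spell out the mechanism, so the following is only a minimal sketch of the general idea of bootstrapping examples from a repository, not the authors' method. It assumes that values already stored in a repository can be located in a page by exact, case-sensitive string matching, and that each match can stand in for an example a user would otherwise mark by hand. The names `Example` and `generate_examples`, the record layout, and the sample page are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class Example(NamedTuple):
    """A labelled span of page text (hypothetical structure for illustration)."""
    attribute: str  # repository field the value came from
    value: str      # matched text
    start: int      # character offset where the match begins
    end: int        # character offset just past the match


def generate_examples(page: str, repository: List[Dict[str, str]]) -> List[Example]:
    """Label every exact occurrence of a known repository value in the page.

    Assumption: exact, case-sensitive matching is good enough. A real system
    would likely need approximate matching and disambiguation.
    """
    examples: List[Example] = []
    for record in repository:
        for attribute, value in record.items():
            if not value:
                continue  # skip empty fields, which would match everywhere
            pos = page.find(value)
            while pos != -1:
                examples.append(Example(attribute, value, pos, pos + len(value)))
                pos = page.find(value, pos + 1)
    # Return matches in page order so they read like user-marked examples.
    return sorted(examples, key=lambda e: e.start)


# Hypothetical page and repository, used only to exercise the sketch.
page = "<li><b>Dune</b> by Frank Herbert</li><li><b>Emma</b> by Jane Austen</li>"
repository = [{"title": "Dune", "author": "Frank Herbert"}]

for example in generate_examples(page, repository):
    print(example.attribute, repr(example.value), example.start, example.end)
```

In this sketch the second list item ("Emma" by Jane Austen) is not in the repository, so it yields no example. In an example-based setting, a wrapper induced from the matched first item would be what extracts such unseen records.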