Extracting semi-structured data through examples

  • Authors:
  • Berthier Ribeiro-Neto;Alberto H. F. Laender;Altigran S. da Silva

  • Affiliations:
  • Department of Computer Science, Federal University of Minas Gerais, 31270-901 Belo Horizonte MG, Brazil;Department of Computer Science, Federal University of Minas Gerais, 31270-901 Belo Horizonte MG, Brazil;Department of Computer Science, Federal University of Minas Gerais, 31270-901 Belo Horizonte MG, Brazil

  • Venue:
  • Proceedings of the eighth international conference on Information and knowledge management
  • Year:
  • 1999

Abstract

In this paper, we describe an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use this information to extract new objects from new pages or texts. To extract new objects, we introduce a bottom-up extraction strategy and, through experimentation, demonstrate that it works quite effectively with distinct Web sources, even when the user provides only a few examples.