Automatic wrapper maintenance for semi-structured web sources using results from previous queries

  • Authors:
  • Juan Raposo;Alberto Pan;Manuel Álvarez;Ángel Viña

  • Affiliations:
  • University of A Coruña, Campus de Elviña s/n, A Coruña, Spain;University of A Coruña, Campus de Elviña s/n, A Coruña, Spain;University of A Coruña, Campus de Elviña s/n, A Coruña, Spain;University of A Coruña, Campus de Elviña s/n, A Coruña, Spain

  • Venue:
  • Proceedings of the 2005 ACM symposium on Applied computing
  • Year:
  • 2005

Abstract

In recent years, significant attention has been paid to the problem of building wrappers that extract data from semi-structured web sources. However, because web sources are autonomous, they may change in ways that invalidate these wrappers. In this paper, we present new heuristics and algorithms for automatic wrapper maintenance. Our approach collects query results while the wrapper is operating and later uses them to generate new sets of examples, from which a new wrapper can be induced when the source changes.
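
The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the names `ResultStore` and `relabel_examples`, the sample query, and the sample page are all hypothetical, and the sketch assumes that a stored field value still appears verbatim in the changed page, which a real system could not rely on. It only shows how results saved during normal operation might be located again in a new page layout to produce labelled examples for re-inducing a wrapper.

```python
from dataclasses import dataclass, field


@dataclass
class ResultStore:
    """Hypothetical store for records returned while the wrapper still works."""
    records_by_query: dict[str, list[dict[str, str]]] = field(default_factory=dict)

    def save(self, query: str, records: list[dict[str, str]]) -> None:
        # Accumulate the records extracted for this query.
        self.records_by_query.setdefault(query, []).extend(records)


def relabel_examples(
    page_text: str, stored: list[dict[str, str]]
) -> list[dict[str, tuple[int, int]]]:
    """Locate stored records in a page fetched after the source changed.

    Assumption: a record counts as a labelled example only if every one of
    its field values occurs verbatim in the new page. Each example maps a
    field name to the (start, end) offsets of its first occurrence.
    """
    examples = []
    for record in stored:
        spans = {}
        for name, value in record.items():
            start = page_text.find(value)
            if start == -1:
                break  # this record is no longer on the page; skip it
            spans[name] = (start, start + len(value))
        else:
            examples.append(spans)
    return examples


# Usage with made-up data: results saved before the change ...
store = ResultStore()
store.save(
    "author=Borges",
    [
        {"title": "Ficciones", "price": "12.50"},
        {"title": "El Aleph", "price": "9.90"},
    ],
)

# ... and the same query re-run against the changed page layout.
new_page = (
    "<div><b>Ficciones</b> <i>EUR 12.50</i></div>"
    "<div><b>Labyrinths</b> <i>EUR 15.00</i></div>"
)
examples = relabel_examples(new_page, store.records_by_query["author=Borges"])
```

Here only the first stored record is found again in the new page, so it alone becomes a labelled example. A wrapper induction step, not shown, would then take such examples as input.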