Wrapper Generation via Grammar Induction

  • Authors:
  • Boris Chidlovskii; Jon Ragetli; Maarten de Rijke

  • Venue:
  • ECML '00 Proceedings of the 11th European Conference on Machine Learning
  • Year:
  • 2000

Abstract

To facilitate effective search on the World Wide Web, meta search engines have been developed that do not search the Web themselves but instead use available search engines to find the required information. Meta search engines rely on wrappers to retrieve information from the pages returned by those search engines. We present an approach that creates such wrappers automatically by means of an incremental grammar induction algorithm, which uses an adaptation of the string edit distance. Our method performs well: it is quick, can be used for several types of result pages, and requires a minimal amount of user interaction.
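
For orientation, the string edit distance mentioned in the abstract can be sketched as follows. This is the classic Levenshtein distance, not the authors' adaptation, which the abstract does not describe. The function name `edit_distance` and the idea of comparing sequences of HTML tokens are assumptions made for illustration only.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two sequences.

    Illustrative only: unit cost for insertion, deletion and substitution.
    The paper's adapted distance may use different operations or costs.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty sequence
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x from a
                curr[j - 1] + 1,     # insert y into a
                prev[j - 1] + cost,  # substitute x by y (or match)
            ))
        prev = curr
    return prev[-1]


# Hypothetical token sequences for two items on a result page.
item_1 = ["<li>", "<a>", "TEXT", "</a>", "</li>"]
item_2 = ["<li>", "<b>", "TEXT", "</b>", "</li>"]
print(edit_distance(item_1, item_2))
```

Because the function works on any sequence, it can compare lists of page tokens as well as plain strings. A small distance between two items suggests that they share a common structure.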