Learning (k,l)-contextual tree languages for information extraction from web pages

  • Authors: Stefan Raeymaekers; Maurice Bruynooghe; Jan Van den Bussche

  • Affiliations: Dept. of Computer Science, K.U.Leuven, Leuven, Belgium 3001; Dept. of Computer Science, K.U.Leuven, Leuven, Belgium 3001; Universiteit Hasselt and Transnationale Universiteit Limburg, Diepenbeek, Belgium 3590

  • Venue: Machine Learning
  • Year: 2008

Abstract

This paper introduces a novel method, based on (k,l)-contextual tree languages, for learning a wrapper that extracts information from web pages. It also introduces a method for learning good values of k and l from a few positive and negative examples. Finally, it describes how the algorithm can be integrated into a tool for information extraction.