A Supervised Visual Wrapper Generator for Web-Data Extraction

  • Authors:
  • Xiaofeng Meng; Haiyan Wang; Dongdong Hu; Chen Li

  • Venue:
  • COMPSAC '03 Proceedings of the 27th Annual International Conference on Computer Software and Applications
  • Year:
  • 2003

Abstract

Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper, we propose a novel schema-guided approach to wrapper generation. We provide a user-friendly interface that allows users to define the schema of the data to be extracted and to specify mappings from an HTML page to the target schema. Based on the mappings, the system can automatically generate an extraction rule to extract data from the page. Our approach to wrapper generation can significantly reduce the human effort involved in this process. Users never have to deal with the internal extraction rule, or even be familiar with the details of HTML.
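
The pipeline the abstract describes (a target schema, mappings from page elements to that schema, and a rule generated from the mappings) can be illustrated with a minimal Python sketch. It is not the authors' system. The names SCHEMA, MAPPINGS, generate_rule, RuleExtractor, and extract are hypothetical, and identifying page elements by a (tag, class) pair is an assumption made for brevity; the paper's mappings are defined through its visual interface and its rule format is not given here.

```python
from html.parser import HTMLParser

# Hypothetical target schema: the fields a user wants to extract.
SCHEMA = ["title", "price"]

# Hypothetical mappings from schema fields to page elements, identified
# here by (tag, class attribute). This format is an assumption.
MAPPINGS = {"title": ("h2", "name"), "price": ("span", "cost")}


def generate_rule(schema, mappings):
    """Build a rule mapping (tag, class) to a field, for each mapped schema field."""
    return {mappings[field]: field for field in schema if field in mappings}


class RuleExtractor(HTMLParser):
    """Applies a generated rule, collecting the text of matching elements."""

    def __init__(self, rule):
        super().__init__()
        self.rule = rule
        self.current = None  # field whose element is currently open
        self.records = {field: [] for field in rule.values()}

    def handle_starttag(self, tag, attrs):
        key = (tag, dict(attrs).get("class"))
        if key in self.rule:
            self.current = self.rule[key]
            self.records[self.current].append("")

    def handle_data(self, data):
        if self.current is not None:
            self.records[self.current][-1] += data

    def handle_endtag(self, tag):
        # Stop collecting at the next closing tag; nested markup inside a
        # matched element is not handled in this sketch.
        self.current = None


def extract(html, rule):
    """Run the rule over an HTML string and return field -> list of values."""
    parser = RuleExtractor(rule)
    parser.feed(html)
    parser.close()
    return {f: [v.strip() for v in vals] for f, vals in parser.records.items()}
```

In this sketch the user supplies only the schema and the mappings; the rule is derived from them and applied by calling extract(page, generate_rule(SCHEMA, MAPPINGS)), so the rule itself never has to be written or read by hand.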