Toolkits for Generating Wrappers

  • Authors:
  • Stefan Kuhlins;Ross Tredwell

  • Affiliations:
  • -;-

  • Venue:
  • NODe '02 Revised Papers from the International Conference NetObjectDays on Objects, Components, Architectures, Services, and Applications for a Networked World
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

Various web applications in e-business, such as online price comparisons, competition monitoring and personalised newsletters require retrieval of distributed information from the Internet. This paper examines the suitability of software toolkits for the extraction of data from web sites. The term wrapper is defined and an overview of presently available toolkits for generating wrappers is provided. In order to give a better insight into the workings of such toolkits, a detailed analysis of the non-commercial software program LAPIS is presented. An example application using this toolkit demonstrates how acceptable results can be achieved with relative ease. The functionality of the program is compared with the functionality of the commercial toolkit RoboMaker and the differences are highlighted. With the aim of providing improved ease-of-use and faster wrapper generation in mind, possible areas for further development of toolkits for automated web data extraction are discussed.