A Workflow-Based Approach for Creating Complex Web Wrappers

  • Authors:
  • Paula Montoto; Alberto Pan; Juan Raposo; José Losada; Fernando Bellas; Javier López

  • Affiliations:
  • Department of Information and Communication Technologies, University of A Coruña, Spain (all authors)

  • Venue:
  • WISE '08: Proceedings of the 9th International Conference on Web Information Systems Engineering
  • Year:
  • 2008

Abstract

To let software programs access and use the information and services provided by web sources, wrapper programs must be built to provide a "machine-readable" view over them. Although the research literature on web wrappers is vast, the problem of how to specify the internal logic of complex wrappers in a simple, graphical way remains largely unaddressed. In this paper, we propose a new language for this task. Our approach builds on existing work on intelligent web data extraction and automatic web navigation, and uses a workflow-based approach to specify the wrapper control logic. The features included in the language were chosen based on a study of a wide range of real web automation applications from different business areas. We also present the most salient results of that study.