Automation of the deep web with user defined behaviours

  • Authors:
  • Vicente Luque Centeno;Carlos Delgado Kloos;Peter T. Breuer;Luis Sánchez Fernández;Ma. Eugenia Gonzalo Cabellos;Juan Antonio Herráiz Pérez

  • Affiliations:
  • Departamento de Ingeniería Telemática, Universidad Carlos III de Madrid, Madrid, Spain (all authors)

  • Venue:
  • AWIC'03: Proceedings of the 1st International Atlantic Web Intelligence Conference on Advances in Web Intelligence
  • Year:
  • 2003

Abstract

Giving semantics to Web data is an issue for automated Web navigation. Since legacy Web pages have been built for years using HTML as a visualization-oriented markup, data on the Web is suitable for people using browsers, but not for programs that automatically perform a task on the Web on behalf of their users. The W3C Semantic Web initiative [16] tries to solve this by explicitly declaring semantic descriptions in metadata (typically RDF [19] and OWL [23]) associated with Web pages, together with ontologies combined with semantic rules. This way, inference-enabled agents may deduce which actions (links to be followed, forms to be filled, ...) should be executed in order to retrieve the results for a user's query. However, automating tasks on the Web requires more than inferring how to retrieve information from it. Information retrieval [3] is only the first step. Other actions, such as relevant data extraction, data homogenization and user-definable processing, are needed as well for automating Web-enabled applications running on Web servers. This paper proposes two programming languages for instructing assistants about how to explore Web sites according to the user's aims, and provides a real example from the legacy deep Web.