Preloading browsers for optimizing automatic access to hidden web: a ranking-based repository solution

  • Authors:
  • Justo Hidalgo;Alberto Pan;José Losada;Manuel Álvarez

  • Affiliations:
  • Denodo Technologies, Inc., Madrid, Spain;Department of Information and Communications Technologies, University of A Coruña, Spain;Denodo Technologies, Inc., Madrid, Spain;Department of Information and Communications Technologies, University of A Coruña, Spain

  • Venue:
  • ADBIS'06 Proceedings of the 10th East European conference on Advances in Databases and Information Systems
  • Year:
  • 2006

Abstract

As Web applications grow in quantity and quality, different vertical solutions could use them as an important source of information. Nevertheless, obtaining information from web sources is challenging: access is complex because of the hypertext browsing paradigm and HTML's semistructured format. Web Automation middleware navigates web links and fills in web forms automatically in order to extract information from the Hidden Web. The main optimization parameter is the time required to navigate through the intermediate pages that lead to the desired results. This work proposes a technique that reduces browsing time by storing information from previous queries and using it to preload an adequate subset of the navigational sequence on a specific browser before the next sequence is launched. It also takes into account which sequences are used most commonly, so that these are the ones preloaded most often.
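
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `SequenceRepository`, the step labels, and the fixed prefix length are all assumptions made for illustration, and the ranking here is a plain usage count rather than whatever scoring the paper defines. The sketch records each executed navigation sequence, ranks sequences by how often they were used, and picks one navigation prefix per idle browser to preload.

```python
from collections import Counter
from typing import List, Tuple

# A navigation sequence is modelled as an ordered tuple of step labels.
# The labels are hypothetical placeholders, not taken from the paper.
Sequence = Tuple[str, ...]


class SequenceRepository:
    """Hypothetical repository that counts how often each sequence runs."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, sequence: Sequence) -> None:
        """Store one execution of a navigation sequence."""
        self._counts[tuple(sequence)] += 1

    def ranked(self) -> List[Tuple[Sequence, int]]:
        """Return sequences ordered by usage count, most used first."""
        # Ties are broken by the sequence itself to keep the order stable.
        return sorted(self._counts.items(), key=lambda kv: (-kv[1], kv[0]))

    def plan_preloads(self, browsers: int, prefix_len: int) -> List[Sequence]:
        """Pick one distinct prefix per idle browser, most used first."""
        plan: List[Sequence] = []
        for seq, _count in self.ranked():
            if len(plan) >= browsers:
                break
            prefix = seq[:prefix_len]
            if prefix not in plan:  # avoid preloading the same prefix twice
                plan.append(prefix)
        return plan


repo = SequenceRepository()
for _ in range(3):
    repo.record(("login", "search_form", "results"))
for _ in range(2):
    repo.record(("login", "search_form", "details"))
repo.record(("home", "catalog", "item"))

# Two idle browsers, each preloaded with the first two steps of a sequence.
plan = repo.plan_preloads(browsers=2, prefix_len=2)
print(plan)
```

In this simplified version each idle browser receives a different prefix. A scheme closer to the abstract's wording, where popular sequences are preloaded more often, could instead allocate browsers in proportion to the usage counts.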