Accessing the deep web: when good ideas go bad

  • Authors:
  • Alfredo Alba;Varun Bhagwan;Tyrone Grandison

  • Affiliations:
  • IBM Almaden Research Center, San Jose, CA, USA;IBM Almaden Research Center, San Jose, CA, USA;IBM Almaden Research Center, San Jose, CA, USA

  • Venue:
  • Companion to the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Prevailing wisdom assumes that there are well-defined, effective and efficient methods for accessing Deep Web content. Unfortunately, there are a host of technical and non-technical factors that may call this assumption into question. In this paper, we present the findings from work on a software system, which was commissioned by the British Broadcasting Corporation (BBC). The system requires stable and periodic extraction of Deep Web content from a number of online data sources. The insight from the project brings an important issue to the forefront and under-scores the need for further research into access technology for the Deep Web.