Accessing the deep web: when good ideas go bad

Authors:
Alfredo Alba; Varun Bhagwan; Tyrone Grandison
Affiliations:
IBM Almaden Research Center, San Jose, CA, USA; IBM Almaden Research Center, San Jose, CA, USA; IBM Almaden Research Center, San Jose, CA, USA
Venue:
Companion to the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications
Year:
2008

Abstract

Prevailing wisdom assumes that there are well-defined, effective, and efficient methods for accessing Deep Web content. Unfortunately, a host of technical and non-technical factors may call this assumption into question. In this paper, we present findings from work on a software system commissioned by the British Broadcasting Corporation (BBC). The system requires stable, periodic extraction of Deep Web content from a number of online data sources. The insight from the project brings an important issue to the forefront and underscores the need for further research into access technology for the Deep Web.