Object-Extraction-Based Hidden Web Information Retrieval

Authors:
Song Hui; Zhang Ling; Ye Yunming; Ma Fanyuan
Venue:
WAIM '02 Proceedings of the Third International Conference on Advances in Web-Age Information Management
Year:
2002

Abstract

Traditional search engines ignore the tremendous amount of information "hidden" behind the search forms of Web pages in large searchable electronic databases; this content is called the hidden Web. In this paper, we address the problem of designing a system for extracting and retrieving hidden Web information. We present a generic operational model of hidden Web information retrieval and describe its key techniques. We introduce a new Tag-Tree-based Object Extraction Technique for automatically extracting hidden Web information from Web pages. Based on this technique, we implement a retrieval algorithm for structured queries over hidden Web information. Test results are also reported.