Domain-Specific Deep Web Sources Discovery

Authors:
Ying Wang;Wanli Zuo;Tao Peng;Fengling He
Affiliations:
-;-;-;-
Venue:
ICNC '08 Proceedings of the 2008 Fourth International Conference on Natural Computation - Volume 05
Year:
2008

Citing 0
Cited 5

An enhanced swarm intelligence clustering-based RBFNN classifier and its application in deep Web sources classification
Frontiers of Computer Science in China

Deep web sources classifier based on DSOM-EACO clustering model
ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I

A QIIIEP based domain specific hidden web crawler
Proceedings of the International Conference & Workshop on Emerging Trends in Technology

E-FFC: an enhanced form-focused crawler for domain-specific deep web databases
Journal of Intelligent Information Systems

A Novel Architecture for Deep Web Crawler
International Journal of Information Technology and Web Engineering

Abstract

The web has rapidly deepened as myriad searchable databases have come online, with their data hidden behind query interfaces. However, users often have difficulty finding the right sources among these databases and then querying them. To address this problem, this paper presents a new method that applies focused crawling to discover deep web sources automatically. First, it locates web sites for domain-specific data sources using focused crawling. Second, it judges whether a web site contains a deep web query interface within its first three depths. Lastly, it judges whether the deep web query interface is relevant to a given topic. Applying focused crawling confines the identification of deep web query interfaces to a specific domain and captures pages relevant to a given topic instead of pursuing high coverage. The method dramatically reduces the number of pages the crawler must examine to identify deep web query interfaces.
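
The second and third steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how interfaces are detected or how relevance is judged, so the sketch assumes a simple heuristic (a form with a free-text input counts as a query interface) and a keyword-overlap relevance test. The names `PageScanner`, `is_relevant`, `find_query_interfaces`, `SITE`, and `TOPIC` are hypothetical, the site is an in-memory dictionary rather than a live crawl, and the root page is assumed to be at depth 1 and to have already been located by the focused crawler (step one).

```python
from collections import deque
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects links, visible text, and query-form presence from one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []
        self.in_form = False
        self.has_query_form = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form":
            self.in_form = True
        elif tag == "input" and self.in_form:
            # Assumed heuristic: a free-text field inside a form
            # marks the form as a query interface.
            if attrs.get("type", "text") in ("text", "search"):
                self.has_query_form = True

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False

    def handle_data(self, data):
        self.text.append(data.lower())


def is_relevant(text_chunks, topic_terms, threshold=2):
    """Assumed relevance test: at least `threshold` topic terms on the page."""
    words = set(" ".join(text_chunks).split())
    return len(words & topic_terms) >= threshold


def find_query_interfaces(site, root, topic_terms, max_depth=3):
    """Breadth-first scan of `site` (url -> html), root counted as depth 1."""
    found = []
    seen = {root}
    queue = deque([(root, 1)])
    while queue:
        url, depth = queue.popleft()
        page = PageScanner()
        page.feed(site.get(url, ""))
        if page.has_query_form and is_relevant(page.text, topic_terms):
            found.append(url)
        if depth < max_depth:  # do not follow links past the depth limit
            for link in page.links:
                if link in site and link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return found


# Hypothetical toy site: "/deep" sits at depth 4 and is never visited.
SITE = {
    "/": '<a href="/about">About</a> <a href="/catalog">Catalog</a>',
    "/about": "<p>About our shop</p>",
    "/catalog": '<a href="/search">Search</a> <a href="/login">Login</a>',
    "/search": '<form>Find books by title or author <input type="text" name="q"></form>',
    "/login": '<form>User name <input type="text" name="u"></form> <a href="/deep">More</a>',
    "/deep": '<form>Find books by author <input type="text" name="q"></form>',
}
TOPIC = {"books", "title", "author"}

print(find_query_interfaces(SITE, "/", TOPIC))
```

In this toy site the login form is rejected by the relevance test and the page at depth 4 is never fetched, which mirrors the abstract's point that limiting depth and filtering by topic reduce the number of pages the crawler has to examine.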