Discovering the biomedical deep web

Authors:
Rajesh Ramanand;King-Ip Lin
Affiliations:
Department of Computer Science, The University of Memphis, Memphis, TN;Department of Computer Science, The University of Memphis, Memphis, TN
Venue:
WISE'05 Proceedings of the 6th international conference on Web Information Systems Engineering
Year:
2005

Citing 2
Cited 0

Focused crawling: a new approach to topic-specific Web resource discovery
WWW '99 Proceedings of the eighth international conference on World Wide Web

Crawling the Hidden Web
Proceedings of the 27th International Conference on Very Large Data Bases

Abstract

The rapid growth of biomedical information in the Deep Web has produced unprecedented challenges for traditional search engines. This paper describes a new Deep Web resource discovery system for biomedical information. We designed two hypertext mining applications: a Focused Crawler that selectively seeks out relevant pages, using a classifier that evaluates each document's relevance to biomedical information, and a Query Interface Extractor that extracts information from a page to detect the presence of a Deep Web database. Our anecdotal evidence suggests that combining focused crawling with query interface extraction is very effective for building high-quality collections of Deep Web resources on biomedical topics.
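
The two components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier or the extraction rules, so the keyword list `BIOMEDICAL_TERMS`, the `relevance` scorer, the form heuristic in `PageParser`, and the `crawl` function with its `threshold` and `limit` parameters are all hypothetical stand-ins. To stay self-contained, the crawl runs over an in-memory dictionary of pages rather than the live Web.

```python
import heapq
from html.parser import HTMLParser

# Hypothetical keyword list standing in for a trained relevance classifier.
BIOMEDICAL_TERMS = {"gene", "protein", "disease", "clinical", "genome", "drug"}


def relevance(text):
    """Toy relevance score: fraction of words that are biomedical terms."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    if not words:
        return 0.0
    return sum(w in BIOMEDICAL_TERMS for w in words) / len(words)


class PageParser(HTMLParser):
    """Collects page text, outgoing links, and evidence of a query form."""

    def __init__(self):
        super().__init__()
        self.text, self.links = [], []
        self.in_form = False
        self.has_text_input = False
        self.has_password = False
        self.has_query_interface = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "form":
            self.in_form = True
            self.has_text_input = self.has_password = False
        elif tag == "input" and self.in_form:
            kind = (attrs.get("type") or "text").lower()
            if kind in ("text", "search"):
                self.has_text_input = True
            elif kind == "password":
                self.has_password = True

    def handle_endtag(self, tag):
        if tag == "form" and self.in_form:
            # Assumed heuristic: a free-text field without a password
            # field suggests a search form rather than a login form.
            if self.has_text_input and not self.has_password:
                self.has_query_interface = True
            self.in_form = False

    def handle_data(self, data):
        self.text.append(data)


def crawl(pages, seed, threshold=0.05, limit=100):
    """Best-first crawl over `pages` (URL -> HTML), an in-memory stand-in
    for the Web. Returns relevant URLs that appear to host a query form."""
    frontier = [(-1.0, seed)]  # max-heap via negated priority
    seen = {seed}
    found = []
    while frontier and limit > 0:
        _, url = heapq.heappop(frontier)
        limit -= 1
        html = pages.get(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        parser.close()
        score = relevance(" ".join(parser.text))
        if score < threshold:
            continue  # irrelevant page: do not follow its links
        if parser.has_query_interface:
            found.append(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                # Links inherit the relevance of the page that cites them.
                heapq.heappush(frontier, (-score, link))
    return found
```

In this sketch the frontier is ordered by the relevance of the linking page, so links found on strongly biomedical pages are visited first, and pages scoring below `threshold` are never expanded. A real system would replace `relevance` with a trained classifier and would fetch pages over HTTP.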