Focused crawling is the process of fetching domain-specific pages from the Web. It is an important method for building domain-specific document collections, but it suffers from low recall because crawling algorithms operate locally and the Web has a community structure. In this study, we address the limited scope of focused crawling with a result-merging approach: we merged the results of crawling processes that used different start URL sets and different focused crawling methods. We found that merging considerably improves the effectiveness of focused crawling. The results reported here are based on 10 test topics and 140 crawls in the domains of genomics and genetics.
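
The abstract does not specify how the crawl results were merged, so the following is only a minimal sketch of one plausible scheme, not the method used in the study. It assumes each crawl yields a list of (URL, relevance score) pairs, takes the union of the URLs, and keeps the highest score seen for each. The names `merge_crawls` and `recall`, the result format, and the example URLs are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical result format: one (url, relevance_score) pair per fetched page.
CrawlResult = List[Tuple[str, float]]


def merge_crawls(crawls: Iterable[CrawlResult]) -> CrawlResult:
    """Union the pages of several crawls, keeping each URL's best score."""
    best: Dict[str, float] = {}
    for crawl in crawls:
        for url, score in crawl:
            if url not in best or score > best[url]:
                best[url] = score
    # Rank by descending score; break ties by URL for a deterministic order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


def recall(found: CrawlResult, relevant: Set[str]) -> float:
    """Fraction of the known relevant pages that appear in a result list."""
    if not relevant:
        return 0.0
    found_urls = {url for url, _ in found}
    return len(found_urls & relevant) / len(relevant)


# Two crawls started from different (hypothetical) seed sets.
crawl_a = [("http://example.org/gene1", 0.9), ("http://example.org/gene2", 0.4)]
crawl_b = [("http://example.org/gene2", 0.7), ("http://example.org/gene3", 0.6)]
merged = merge_crawls([crawl_a, crawl_b])

# Assumed set of relevant pages for the topic, for illustration only.
relevant_pages = {
    "http://example.org/gene1",
    "http://example.org/gene3",
    "http://example.org/gene4",
}
```

In this toy case each individual crawl finds one of the three relevant pages, while the merged result finds two, which illustrates how combining crawls with different starting points can raise recall.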