A cross-language focused crawling algorithm based on multiple relevance prediction strategies
Computers & Mathematics with Applications
Topical web crawling using weighted anchor text and web page change detection techniques
WSEAS Transactions on Information Science and Applications
E-FFC: an enhanced form-focused crawler for domain-specific deep web databases
Journal of Intelligent Information Systems
The rapid growth of the World Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. A focused crawler aims to selectively seek out pages that are relevant to a predefined set of topics. Besides specifying topics by keywords, it is customary to also use exemplary documents to compute the similarity of a given web document to the topic. In this paper we introduce a new hybrid focused crawler, which uses the link structure of documents as well as the similarity of pages to the topic to crawl the web.
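
The abstract does not give the scoring details, so the following is only a minimal sketch of how a frontier might combine the two signals it names. Everything here is an assumption made for illustration: the names `crawl`, `tokenize` and `cosine_similarity`, the weight `alpha`, the use of cosine similarity over raw term counts, the use of in-link counts among fetched pages as the link-structure signal, and the in-memory `pages` dictionary that stands in for the web. It is not the method of the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Turn text into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def crawl(pages, seeds, topic_text, alpha=0.5, limit=10):
    """Visit up to `limit` pages, preferring URLs that look on-topic.

    pages: dict mapping url -> (text, list of outlinks); a hypothetical
           in-memory stand-in for fetching real pages.
    alpha: assumed weight between the content signal and the link signal.
    """
    topic = tokenize(topic_text)
    # Frontier: url -> similarity of the most relevant page linking to it.
    frontier = {url: 1.0 for url in seeds}
    in_links = Counter()  # link-structure signal: in-links seen so far
    visited, visited_set = [], set()

    while frontier and len(visited) < limit:
        total = max(1, len(visited))

        def priority(url):
            content = frontier[url]
            structure = in_links[url] / total
            return alpha * content + (1 - alpha) * structure

        url = max(frontier, key=priority)
        del frontier[url]
        visited.append(url)
        visited_set.add(url)

        text, links = pages.get(url, ("", []))
        sim = cosine_similarity(tokenize(text), topic)
        # dict.fromkeys removes duplicate links while keeping their order.
        for link in dict.fromkeys(links):
            if link in visited_set:
                continue
            in_links[link] += 1
            frontier[link] = max(frontier.get(link, 0.0), sim)
    return visited


# Usage on a tiny hypothetical link graph.
pages = {
    "seed": ("focused crawler web", ["on", "off"]),
    "on": ("focused crawler search engine", ["on2"]),
    "off": ("cooking recipes pasta", ["off2"]),
    "on2": ("focused crawler", []),
    "off2": ("pasta sauce", []),
}
order = crawl(pages, ["seed"], "focused crawler web search", limit=4)
```

In this sketch an unvisited URL inherits the similarity of its best parent, since its own content is unknown until it is fetched. A real crawler would likely replace the in-link count with a stronger link-analysis score and the raw term counts with a weighted representation built from the exemplary documents.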