In this paper we present a novel approach for building a focused crawler. The goal of our crawler is to effectively identify web pages that relate to a set of predefined topics and to download them regardless of their web topology or their connectivity with other popular pages on the web. The main challenges that we address in our study are: (i) how to effectively identify a page's topical content before the page is fully downloaded and processed, and (ii) how to obtain a well-balanced set of training examples that the crawler will regularly consult in its subsequent web visits.
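
To make challenge (i) concrete, the following is a minimal sketch of one common way to estimate a page's relevance before downloading it: score each outgoing link by its anchor text and visit the highest-scoring links first. This is not the method of the paper, which the abstract does not detail. The names `anchor_score` and `Frontier`, the keyword-overlap scoring, and the example URLs are all illustrative assumptions.

```python
import heapq


def anchor_score(anchor_text, topic_terms):
    """Fraction of topic terms that appear in a link's anchor text.

    Assumption: plain keyword overlap stands in for whatever topical
    classifier a real focused crawler would use.
    """
    if not topic_terms:
        return 0.0
    words = set(anchor_text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


class Frontier:
    """Hypothetical crawl frontier that orders unvisited URLs by estimated relevance."""

    def __init__(self, topic_terms):
        self.topic_terms = {t.lower() for t in topic_terms}
        self._heap = []      # entries are (-score, url); heapq is a min-heap
        self._seen = set()   # URLs already queued, to avoid duplicates

    def add(self, url, anchor_text):
        """Queue a link using only information available before download."""
        if url in self._seen:
            return
        self._seen.add(url)
        score = anchor_score(anchor_text, self.topic_terms)
        heapq.heappush(self._heap, (-score, url))

    def pop(self):
        """Return the (url, score) pair with the highest estimated relevance."""
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score


frontier = Frontier({"jazz", "music"})
frontier.add("http://example.com/a", "Contact us")
frontier.add("http://example.com/b", "Jazz music festivals")
frontier.add("http://example.com/c", "Live jazz")
first_url, first_score = frontier.pop()
```

In this sketch the link whose anchor text covers the most topic terms is fetched first, independent of how well connected the target page is, which mirrors the abstract's stated goal of not relying on link topology.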