Efficient crawling through URL ordering
WWW7 Proceedings of the seventh international conference on World Wide Web 7
The shark-search algorithm. An application: tailored Web site mapping
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Generalization performance of support vector machines and other pattern classifiers
Advances in kernel methods
Focused crawling: a new approach to topic-specific Web resource discovery
WWW '99 Proceedings of the eighth international conference on World Wide Web
Adaptive Retrieval Agents: Internalizing Local Contextand Scaling up to the Web
Machine Learning - Special issue on information retrieval
Intelligent crawling on the World Wide Web with arbitrary predicates
Proceedings of the 10th international conference on World Wide Web
On the design of a learning crawler for topical resource discovery
ACM Transactions on Information Systems (TOIS)
Accelerated focused crawling through online relevance feedback
Proceedings of the 11th international conference on World Wide Web
Training Invariant Support Vector Machines
Machine Learning
Using Reinforcement Learning to Spider the Web Efficiently
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Focused Crawling Using Context Graphs
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
A note on the utility of incremental learning
AI Communications
An overview of statistical learning theory
IEEE Transactions on Neural Networks
Hi-index | 0.00 |
Topic-driven Web Crawler or focused crawler is the key tool of on-line web information library. It's a challenging issue that how to achieve good performance efficiently with limited time and space resources. This paper proposes a focused web crawler wHunter that implements incremental and multi-strategy learning by taking the advantages of both SVM (support vector machines) and naïve Bayes. On the one hand, the initial performance is guaranteed via SVM classifier; on the other hand, when enough web pages are obtained, the classifier is switched to naïve Bayes so that on-line incremental learning is achieved. Experimental results show that our proposed algorithm is efficient and easy to implement.