Focused crawling selectively seeks out pages that are relevant to a predefined set of topics. Because an ontology is a well-formed knowledge representation, ontology-based focused crawling approaches have attracted research attention. However, these approaches compute the relevance scores of web pages from manually predefined concept weights, which makes it difficult to find the optimal weights needed to maintain a stable harvest rate during the crawling process. To address this issue, we propose a learnable focused crawling approach based on ontology. An artificial neural network (ANN) is constructed from a domain-specific ontology and applied to the classification of web pages. Experimental results show that our approach outperforms the breadth-first search crawling approach, the simple keyword-based crawling approach, and the focused crawling approach that uses only the domain-specific ontology.
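The contrast between manually weighted and learned concept weights can be illustrated with a minimal Python sketch. It is not the network described in the paper: the abstract does not specify the architecture, the input features, or the training procedure, so a single sigmoid unit trained by stochastic gradient descent stands in for the ANN. The concept list, the sample pages, and all function names are hypothetical and chosen only for illustration. The harvest rate is computed in its usual sense, as the fraction of crawled pages that are relevant.

```python
import math

# Hypothetical ontology concepts for an illustrative "cell phone" domain.
CONCEPTS = ["phone", "battery", "camera", "network"]


def concept_frequencies(text):
    """Frequency of each concept term, normalized by the page length in words."""
    words = text.lower().split()
    total = max(len(words), 1)
    return [words.count(c) / total for c in CONCEPTS]


def manual_relevance(text, weights):
    """Relevance score as a weighted sum with fixed, hand-picked concept weights."""
    return sum(w * f for w, f in zip(weights, concept_frequencies(text)))


def train_weights(pages, labels, epochs=500, lr=1.0):
    """Learn concept weights with one sigmoid unit (a stand-in for the ANN)."""
    weights = [0.0] * len(CONCEPTS)
    bias = 0.0
    features = [concept_frequencies(p) for p in pages]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            pred = 1.0 / (1.0 + math.exp(-z))
            err = pred - y  # gradient of the log loss with respect to z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def is_relevant(text, weights, bias):
    """Classify a page as relevant when the sigmoid output is at least 0.5."""
    x = concept_frequencies(text)
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z)) >= 0.5


def harvest_rate(relevance_flags):
    """Fraction of crawled pages that were judged relevant."""
    return sum(relevance_flags) / len(relevance_flags) if relevance_flags else 0.0


# Tiny hypothetical training set: 1 = on topic, 0 = off topic.
PAGES = [
    "phone battery camera phone",
    "network phone battery",
    "recipe for apple pie",
    "football match results today",
]
LABELS = [1, 1, 0, 0]

WEIGHTS, BIAS = train_weights(PAGES, LABELS)
FLAGS = [is_relevant(p, WEIGHTS, BIAS) for p in PAGES]
print("learned weights:", [round(w, 2) for w in WEIGHTS])
print("harvest rate on the sample pages:", harvest_rate(FLAGS))
```

In a real crawler the classifier's decision would determine whether a page's outgoing links are queued, and the harvest rate would be tracked over the whole crawl rather than over a fixed sample.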