Focused crawling: a new approach to topic-specific Web resource discovery
WWW '99 Proceedings of the eighth international conference on World Wide Web
Accelerated focused crawling through online relevance feedback
Proceedings of the 11th international conference on World Wide Web
Focused Crawling Using Context Graphs
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Subgroup discovery techniques and applications
PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
SETN '08 Proceedings of the 5th Hellenic conference on Artificial Intelligence: Theories, Models and Applications
Improving the performance of focused web crawlers
Data & Knowledge Engineering
Hi-index | 0.01 |
This paper reports a new general framework of focused web crawling based on "relational subgroup discovery". Predicates are used explicitly to represent the relevance clues of those unvisited pages in the crawl frontier, and then first-order classification rules are induced using subgroup discovery technique. The learned relational rules with sufficient support and confidence will guide the crawling process afterwards. We present the many interesting features of our proposed first-order focused crawler, together with preliminary promising experimental results.