BioCrawler: An intelligent crawler for the semantic web

Authors:
Alexandros Batzios;Christos Dimou;Andreas L. Symeonidis;Pericles A. Mitkas
Affiliations:
Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Greece;Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Greece;Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Greece;Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, Greece
Venue:
Expert Systems with Applications: An International Journal
Year:
2008

Abstract

Web crawling has become an important aspect of web search, as the WWW keeps growing and search engines strive to index the most important and up-to-date content. Many experimental approaches exist, but few actually try to model the current behaviour of search engines, which is to crawl and refresh the sites they deem important much more frequently than others. BioCrawler mirrors this behaviour on the semantic web by applying the learning strategies adopted in BioTope, a previous work on ecosystem simulation. BioCrawler employs the principles of BioTope's intelligent agents on the semantic web: it learns which sites are rich in semantic content and which sites link to them, and adjusts its crawling habits accordingly. In the end, it learns to behave much like state-of-the-art search engine crawlers do. However, BioCrawler reaches that behaviour solely by exploiting on-page factors, rather than off-page factors such as the currently used link popularity.
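
The idea of revisiting semantically rich sites more often, using only on-page signals, can be illustrated with a minimal sketch. This is not BioCrawler's agent-based learning mechanism, which the abstract does not specify. It swaps in a plain exponential moving average as a stand-in. The marker list, the names `semantic_richness` and `RevisitScheduler`, and all parameter values are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical on-page markers of semantic content (an assumption, not from the paper).
SEMANTIC_MARKERS = ("rdf:", "rdfs:", "owl:")


def semantic_richness(page_text: str, cap: int = 10) -> float:
    """Map the number of marker occurrences on a page to a score in [0, 1]."""
    hits = sum(page_text.count(marker) for marker in SEMANTIC_MARKERS)
    return min(hits, cap) / cap


@dataclass
class SiteEstimate:
    score: float = 0.0  # running estimate of the site's semantic richness
    visits: int = 0     # number of pages observed on this site


class RevisitScheduler:
    """Assigns shorter revisit intervals to sites with a higher richness estimate."""

    def __init__(self, alpha: float = 0.5, min_hours: float = 1.0, max_hours: float = 24.0):
        self.alpha = alpha          # weight of the newest observation
        self.min_hours = min_hours  # interval for a site with score 1.0
        self.max_hours = max_hours  # interval for a site with score 0.0
        self.sites = {}             # site name -> SiteEstimate

    def observe(self, site: str, page_text: str) -> None:
        """Update the site's estimate from one crawled page (on-page signal only)."""
        est = self.sites.setdefault(site, SiteEstimate())
        observed = semantic_richness(page_text)
        # Exponential moving average: blend the old estimate with the new observation.
        est.score = (1 - self.alpha) * est.score + self.alpha * observed
        est.visits += 1

    def revisit_interval(self, site: str) -> float:
        """Return hours until the next visit; unseen sites get the longest interval."""
        est = self.sites.get(site, SiteEstimate())
        return self.max_hours - est.score * (self.max_hours - self.min_hours)


scheduler = RevisitScheduler()
scheduler.observe("rich.example", "rdf:type owl:Class " * 5)
scheduler.observe("poor.example", "plain html with no markup of interest")
```

After these two observations, `scheduler.revisit_interval("rich.example")` is shorter than `scheduler.revisit_interval("poor.example")`, so the richer site would be refreshed more often. No link popularity or other off-page factor enters the calculation.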