CoBWeb - A Crawler for the Brazilian Web

Authors:
Altigran S. da Silva;Eveline A. Veloso;Paulo B. Golgher;Berthier Ribeiro-Neto;Alberto H. F. Laender;Nivio Ziviani
Venue:
SPIRE '99 Proceedings of the String Processing and Information Retrieval Symposium & International Workshop on Groupware
Year:
1999

Citing 0
Cited 15

Link-based and content-based evidential information in a belief network model
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval

Local versus global link information in the Web
ACM Transactions on Information Systems (TOIS)

Combining link-based and content-based methods for web document classification
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management

Crawling a country: better strategies than breadth-first for web page ordering
WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web

Impedance coupling in content-targeted advertising
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval

Set-based vector model: An efficient approach for correlation-based ranking
ACM Transactions on Information Systems (TOIS)

A comparative study of citations and links in document classification
Proceedings of the 6th ACM/IEEE-CS joint conference on Digital libraries

Multi-evidence, multi-criteria, lazy associative document classification
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management

Characterization of national Web domains
ACM Transactions on Internet Technology (TOIT)

A cost-effective method for detecting web site replicas on search engine databases
Data & Knowledge Engineering

Development of an agent system to collect schedule information on the web for intermodal transportation network planning
CEA'07 Proceedings of the 2007 annual Conference on International Conference on Computer Engineering and Applications

BioCrawler: An intelligent crawler for the semantic web
Expert Systems with Applications: An International Journal

CUCWeb: a Catalan corpus built from the web
WAC '06 Proceedings of the 2nd International Workshop on Web as Corpus

Design and implement a web news retrieval system
KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part III

Using site-level connections to estimate link confidence
Journal of the American Society for Information Science and Technology

Abstract

One of the key components of current Web search engines is the document collector. This paper describes CoBWeb, an automatic document collector whose architecture is distributed and highly scalable. CoBWeb aims to collect a large number of documents per unit of time while observing operational and ethical limits in the crawling process. CoBWeb is part of the SIAM (Information Systems in Mobile Computing Environments) search engine, which is being implemented to support the Brazilian Web. Accordingly, several results related to the Brazilian Web are also presented.
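
The abstract does not say how CoBWeb enforces its operational and ethical limits. As an illustration only, the sketch below shows two checks that crawlers commonly apply: honoring robots.txt exclusion rules and spacing out requests to the same host. The class name `PolitenessPolicy`, its methods, the user agent string, the host names, and the 30-second delay are all hypothetical and are not taken from the paper.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser


class PolitenessPolicy:
    """Hypothetical helper: decides whether a URL may be fetched right now."""

    def __init__(self, user_agent, min_delay_seconds):
        self.user_agent = user_agent
        self.min_delay = min_delay_seconds
        self.robots = {}      # host -> parsed robots.txt rules
        self.last_fetch = {}  # host -> time of the most recent fetch

    def set_robots(self, host, robots_txt):
        # Parse robots.txt text that the caller has already downloaded.
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self.robots[host] = parser

    def may_fetch(self, url, now):
        host = urlparse(url).netloc
        parser = self.robots.get(host)
        # Exclusion rule: skip URLs the site owner has disallowed.
        if parser is not None and not parser.can_fetch(self.user_agent, url):
            return False
        # Rate limit: leave at least min_delay seconds between hits on a host.
        last = self.last_fetch.get(host)
        return last is None or now - last >= self.min_delay

    def record_fetch(self, url, now):
        # Remember when this host was last contacted.
        self.last_fetch[urlparse(url).netloc] = now


# Assumed values for illustration: one request per host every 30 seconds.
policy = PolitenessPolicy("example-bot", 30.0)
policy.set_robots("www.example.com.br", "User-agent: *\nDisallow: /private/\n")
```

In a distributed collector, each worker would consult a policy like this before issuing a request, so that throughput comes from contacting many hosts in parallel rather than from hitting any single host harder.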