UbiCrawler: a scalable fully distributed web crawler
Software—Practice & Experience
The language observatory project (LOP)
WWW '05 Special interest tracks and posters of the 14th international conference on World Wide Web
As part of the Language Observatory Project [4], we have been crawling the entire web space since 2004 and have collected terabytes of data, mostly from Asian and African ccTLDs. In this paper, we report on the current status of the African web and compare it with its status in 2002 and 2004. We focus on the accessibility of web pages, the growth of the web tree, web technology, privacy protection, and web interconnection.