Focused Crawls, Tunneling, and Digital Libraries
ECDL '02 Proceedings of the 6th European Conference on Research and Advanced Technology for Digital Libraries
Nothing is more distributed than the Web, with its content spread across thousands of servers. High-performance hardware and software are essential for effective download, analysis, and organization of this content. We describe our experience using a highly parallel Web crawling system (Mercator) to automatically construct collections of scientific resources for the National Science Digital Library.