Web Recency Maintenance Protocol
IWDC '02 Proceedings of the 4th International Workshop on Distributed Computing, Mobile and Wireless Computing
Agents, Crawlers, and Web Retrieval
CIA '02 Proceedings of the 6th International Workshop on Cooperative Information Agents VI
In this paper, we study how to keep Internet search engines up to date with the changes occurring at web servers across the Internet. Whereas previous work focuses mainly on approaches in which search engines poll web servers on a per-URL basis to obtain update information, our work focuses on an approach in which a web server provides the list of URLs modified since the last poll by a search engine. We also develop new metrics for measuring the freshness of data at the web servers. The primary motivation for our work is the emergence of search engines that serve cached data to users. Access to such cached data can be much faster when the server for the URL in question is busy, unavailable, or located overseas.
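The contrast between per-URL polling and a server-provided list of modified URLs can be illustrated with a minimal sketch. It is not the protocol defined in the paper: the class WebServer, the functions per_url_poll and batch_poll, and the use of plain numeric timestamps are all hypothetical names and assumptions introduced here for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class WebServer:
    """Hypothetical server that records when each URL last changed."""

    last_modified: dict[str, float] = field(default_factory=dict)

    def touch(self, url: str, when: float) -> None:
        # Record that `url` was modified at time `when` (assumed timestamp).
        self.last_modified[url] = when

    def modified_since(self, last_poll: float) -> list[str]:
        # Server-side answer: every URL changed after the engine's last poll.
        return sorted(u for u, t in self.last_modified.items() if t > last_poll)


def per_url_poll(server: WebServer, known_urls: list[str], last_poll: float):
    """Baseline: the engine asks about each known URL, one request per URL."""
    changed = [u for u in known_urls
               if server.last_modified.get(u, 0.0) > last_poll]
    return changed, len(known_urls)  # (changed URLs, requests issued)


def batch_poll(server: WebServer, last_poll: float):
    """Alternative: one request returns the whole list of modified URLs."""
    return server.modified_since(last_poll), 1  # (changed URLs, requests issued)


server = WebServer()
server.touch("/a", 10.0)
server.touch("/b", 20.0)
server.touch("/c", 30.0)

changed_each, requests_each = per_url_poll(server, ["/a", "/b", "/c"], 15.0)
changed_batch, requests_batch = batch_poll(server, 15.0)
```

In this toy setup both strategies find the same changed URLs, but the per-URL baseline issues one request for every URL the engine knows about, whereas the batch query issues a single request per server. The batch query also reports URLs the engine has not seen before, which the per-URL baseline cannot discover.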