Due to the dynamic nature of the web and the complex architectures of modern commercial search engines, the top results returned by major search engines can change dramatically over time. Our experimental data shows that, for all three major search engines (Google, Bing and Yahoo!), approximately 90% of queries have their top 10 results altered within a period of ten days. Although this instability is expected in some situations, such as news-related queries, it is problematic in general because it can dramatically affect retrieval performance measurements and harm users' perception of search quality (for instance, when users cannot re-find a previously found document). In this work we present the first large-scale study of the degree and nature of these changes. We introduce several types of query instability and several metrics to quantify them. We then present a quantitative analysis using 12,600 queries collected from a commercial web search engine over several weeks. Our analysis shows that the results from all major search engines have similar levels of instability, and that many of these changes are temporary. We also identify classes of queries with clearly different instability profiles: for instance, navigational queries are considerably more stable than non-navigational ones, while longer queries are significantly less stable than shorter ones.
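As a rough illustration of how such instability could be measured, the sketch below compares two snapshots of top-10 results per query. The abstract does not define its metrics, so the function names, the exact-order comparison, and the use of Jaccard overlap are assumptions made here for illustration, not the metrics used in the study.

```python
from typing import Dict, List, Sequence


def top_k_changed(before: Sequence[str], after: Sequence[str], k: int = 10) -> bool:
    """Return True if the ordered top-k result lists differ in any way."""
    return list(before[:k]) != list(after[:k])


def jaccard_overlap(before: Sequence[str], after: Sequence[str], k: int = 10) -> float:
    """Set overlap of the top-k results, ignoring rank (assumed metric)."""
    a, b = set(before[:k]), set(after[:k])
    if not a and not b:
        return 1.0  # two empty result lists are treated as identical
    return len(a & b) / len(a | b)


def fraction_of_queries_changed(
    snapshot_before: Dict[str, List[str]],
    snapshot_after: Dict[str, List[str]],
    k: int = 10,
) -> float:
    """Share of queries (present in both snapshots) whose top-k list changed."""
    shared = snapshot_before.keys() & snapshot_after.keys()
    if not shared:
        return 0.0
    changed = sum(
        top_k_changed(snapshot_before[q], snapshot_after[q], k) for q in shared
    )
    return changed / len(shared)


# Hypothetical snapshots: query -> ranked list of result URLs.
day_0 = {"q1": ["a", "b", "c"], "q2": ["x", "y"]}
day_10 = {"q1": ["a", "c", "b"], "q2": ["x", "y"]}
changed_share = fraction_of_queries_changed(day_0, day_10)
```

Here `q1` counts as changed because its ranking was reordered, even though its set overlap is still 1.0. Separating order-sensitive and order-insensitive measures in this way is one option for telling reranking apart from results entering or leaving the top 10.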