How many pages on the Web are actually accessed by Web users? This question is of interest to both Web scientists and industry engineers. To answer it, this paper describes and studies the User Access Web (UA Web). Based on an analysis of large-scale Web user access logs, a sampling procedure is proposed to reduce bias, and near-uniform random pages are sampled from the UA Web using search engine interfaces and Monte Carlo methods. Experimental results on about 675 million user log entries reveal properties of the UA Web and of the indices of four search engines, including a power-law distribution, the average length of pages, the index sizes of the search engines, and the properties of static and dynamic pages.
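
The abstract does not spell out the sampling procedure, so the following is only an illustrative sketch of one standard Monte Carlo technique, rejection sampling, that can turn a popularity-biased draw from an access log into a near-uniform draw over distinct pages. It is not the authors' method. The function name `near_uniform_sample`, the in-memory list of URLs, and the assumption that per-URL access counts are known are all hypothetical choices made for the example.

```python
import random
from collections import Counter


def near_uniform_sample(log_urls, n_samples, seed=0):
    """Draw pages near-uniformly from a log whose entries are popularity-biased.

    Assumption (not from the paper): the whole log fits in memory and
    the access count of every URL is known.
    """
    rng = random.Random(seed)
    counts = Counter(log_urls)          # accesses per distinct URL
    min_count = min(counts.values())    # count of the least-accessed URL
    samples = []
    while len(samples) < n_samples:
        # Picking a random log entry selects a URL with probability
        # proportional to its access count (the bias to be removed).
        url = rng.choice(log_urls)
        # Accepting with probability min_count / count cancels that bias,
        # so every distinct URL is equally likely to be accepted.
        if rng.random() < min_count / counts[url]:
            samples.append(url)
    return samples


if __name__ == "__main__":
    # Toy log: one heavily accessed page and ten rarely accessed pages.
    log = ["hot"] * 900 + [f"cold{i}" for i in range(10) for _ in range(10)]
    sample = near_uniform_sample(log, 5000, seed=1)
    print(Counter(sample).most_common(3))
```

In the toy log the hot page makes up 90% of the entries, but after rejection it should appear in roughly 1 of every 11 samples, the same share as each cold page. The sketch samples with replacement, and its acceptance rate falls as the popularity skew grows, which is one reason a real procedure on hundreds of millions of log entries would need a more careful design.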