What's new on the web?: the evolution of the web from a search engine perspective
Proceedings of the 13th international conference on World Wide Web
A comparison of techniques for estimating IDF values to generate lexical signatures for the web
Proceedings of the 10th ACM workshop on Web information and data management
Correlation of Term Count and Document Frequency for Google N-Grams
ECIR '09 Proceedings of the 31th European Conference on IR Research on Advances in Information Retrieval
The purpose of this paper is threefold. First, we study the evolution of the web using data from an earlier snapshot of the web and compare the results with those predicted in [2]. Second, we examine whether the WT10G dataset, a popular benchmark for the development and evaluation of Internet-based applications, is appropriate for these tasks. Finally, we ask whether a new dataset needs to be collected for such purposes. We find that the appropriateness of the popular WT10G dataset for recent Internet-based experiments is questionable, and that a new dataset is needed for developing and evaluating algorithms related to Internet search engines.