Using titles and category names from editor-driven taxonomies for automatic evaluation
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Information retrieval system evaluation is complicated by the need for manually assessed relevance judgments. Large manually built directories on the web open the door to new evaluation procedures. By treating a web page as a known relevant item for any query that exactly matches its title in the directory, we use the ODP (Open Directory Project) and LookSmart directories for system evaluation. We test our approach on a sample drawn from a log of ten million web queries and show that the resulting evaluation is unbiased with respect to the directory used, stable with respect to the query set selected, and correlated with a reasonably large manual evaluation.
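To make the procedure concrete, the following is a minimal sketch (not the authors' code) of how directory titles can act as pseudo relevance judgments: a query that exactly matches a directory entry's title is treated as having that entry's URL as its known relevant item, and a retrieval system is scored against those judgments. The `(title, url)` input format, the `search_fn` callable standing in for the system under test, and the use of mean reciprocal rank are illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict

def build_pseudo_qrels(directory_entries):
    """Map each normalized directory title to the set of URLs bearing that title.

    `directory_entries` is assumed to be an iterable of (title, url) pairs
    taken from an editor-driven taxonomy such as ODP or LookSmart.
    """
    qrels = defaultdict(set)
    for title, url in directory_entries:
        qrels[title.strip().lower()].add(url)
    return qrels

def reciprocal_rank(ranked_urls, relevant_urls):
    """Reciprocal rank of the first known-relevant URL; 0 if none is retrieved."""
    for rank, url in enumerate(ranked_urls, start=1):
        if url in relevant_urls:
            return 1.0 / rank
    return 0.0

def evaluate(queries, search_fn, qrels, depth=10):
    """Mean reciprocal rank over the queries that exactly match a directory title.

    `search_fn(query, depth)` is a hypothetical hook for the system under test
    and is expected to return a ranked list of URLs.
    """
    scores = []
    for query in queries:
        key = query.strip().lower()
        if key in qrels:  # only title-matching queries receive pseudo judgments
            results = search_fn(query, depth)
            scores.append(reciprocal_rank(results, qrels[key]))
    return sum(scores) / len(scores) if scores else 0.0
```

Under these assumptions, comparing the scores produced from two different directories (or from disjoint query samples) is one way to probe the bias and stability properties the abstract refers to.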