Exploring cost-effective approaches to human evaluation of search engine relevance

Authors:
Kamal Ali;Chi-Chao Chang;Yunfang Juan
Affiliations:
Yahoo Search, Sunnyvale, CA;Yahoo Search, Sunnyvale, CA;Yahoo Search, Sunnyvale, CA
Venue:
ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
Year:
2005

Abstract

In this paper, we examine novel and less expensive methods for search engine evaluation that do not rely on document relevance judgments. These methods, described within a proposed framework, are motivated by the increasing focus on search results presentation, by the growing diversity of documents and content sources, and by the need to measure effectiveness relative to other search engines. Correlation analysis of the data obtained from actual tests using a subset of the methods in the framework suggests that these methods measure different aspects of the search engine. In practice, we argue that the selection of the test method is a tradeoff between measurement intent and cost.