How evaluator domain expertise affects search result relevance judgments

  • Authors:
  • Kenneth A. Kinney; Scott B. Huffman; Juting Zhai

  • Affiliations:
  • Google, Inc., Mountain View, CA, USA (all authors)

  • Venue:
  • Proceedings of the 17th ACM conference on Information and knowledge management
  • Year:
  • 2008

Abstract

Traditional search evaluation approaches have often relied on domain experts to evaluate results for each query. Unfortunately, the range of topics present in any representative sample of web queries makes it impractical to have expert evaluators for every topic. In this paper, we investigate the effect of using "generalist" evaluators instead of experts in the domain of the queries being evaluated. Empirically, we find that for queries drawn from domains requiring high expertise: (1) generalists tend to give shallow, inaccurate ratings as compared to experts; (2) generalists disagree on the underlying meaning of these queries significantly more often than experts, and often appear to "give up" and fall back on surface features such as keyword matching; and (3) by estimating the percentage of "expertise-requiring" queries in a web query sample, we estimate the impact of using generalists versus the ideal of having domain experts for every expertise-requiring query.
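
The abstract turns on two quantities: inter-rater agreement (finding 2) and an aggregate impact estimate (finding 3). As a minimal sketch of how such numbers are commonly computed, the Python below derives Cohen's kappa for pairs of raters and a back-of-the-envelope impact figure. The rating data, the 3-point scale, and every numeric parameter here are hypothetical placeholders, not the paper's actual data, methodology, or results.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels on a 3-point relevance scale (2=good, 1=fair, 0=bad)
# for ten query-result pairs; chosen only to illustrate the computation.
expert_1     = [2, 2, 1, 0, 2, 1, 1, 0, 2, 1]
expert_2     = [2, 2, 1, 0, 1, 1, 1, 0, 2, 1]
generalist_1 = [2, 1, 0, 1, 2, 2, 1, 0, 1, 1]
generalist_2 = [2, 1, 0, 0, 0, 1, 1, 1, 2, 0]

print("expert/expert kappa:        ", round(cohens_kappa(expert_1, expert_2), 2))
print("generalist/generalist kappa:", round(cohens_kappa(generalist_1, generalist_2), 2))

# Back-of-the-envelope impact estimate in the spirit of finding (3): the extra
# rating error from using generalists is roughly the share of expertise-requiring
# queries times the per-query accuracy gap. All values are placeholders.
p_expertise_required = 0.20                   # assumed share of such queries
expert_error, generalist_error = 0.05, 0.30   # assumed per-query error rates
impact = p_expertise_required * (generalist_error - expert_error)
print(f"estimated extra error across all queries: {impact:.1%}")
```

On this reading, even a large per-query accuracy gap dilutes to a modest aggregate effect when expertise-requiring queries are a small share of the sample, which is why the paper's estimate of that share is central to finding (3).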