In the field of information retrieval (IR), researchers and practitioners need valid approaches for evaluating the performance of retrieval systems. The Cranfield paradigm has dominated the in-vitro evaluation of IR systems. As an alternative to this paradigm, laboratory-based user studies have been widely used to evaluate interactive information retrieval (IIR) systems and, at the same time, to investigate users' information searching behaviours. The major drawbacks of laboratory-based user studies for evaluating IIR systems are the high monetary and temporal costs of setting up and running the experiments, the lack of heterogeneity in the user population, and the limited scale of the experiments, which usually involve a relatively small set of users. In this article, we propose an alternative experimental methodology to laboratory-based user studies. Our novel methodology uses a crowdsourcing platform as a means of engaging study participants. Through crowdsourcing, our methodology can capture user interactions and searching behaviours at a lower cost, with more data, and within a shorter period than traditional laboratory-based user studies, and can therefore be used to assess the performance of IIR systems. We show how our approach differs from traditional IIR experimental and evaluation procedures, and we present a case study comparing crowdsourcing-based with laboratory-based evaluation of IIR systems, which can serve as a tutorial for setting up crowdsourcing-based IIR evaluations.
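To make the kind of data collection described here concrete, the following is a minimal sketch of server-side interaction logging for a crowdsourced IIR study. It is not taken from the article: the event types, field names, and file layout are illustrative assumptions about how queries, clicks, and relevance judgements from crowd workers might be recorded for later analysis.

```python
# Hypothetical sketch: logging search interactions in a crowdsourced IIR study.
# All identifiers (event types, field names) are illustrative, not from the article.
import json
import time
from pathlib import Path

LOG_FILE = Path("interaction_log.jsonl")  # one JSON object per line

def log_event(worker_id: str, task_id: str, event_type: str, payload: dict) -> None:
    """Append a timestamped interaction event (query, click, judgement, ...)."""
    record = {
        "worker_id": worker_id,   # anonymised crowd-worker identifier
        "task_id": task_id,       # search task / topic identifier
        "event": event_type,      # e.g. "query", "click", "relevance_judgement"
        "timestamp": time.time(),
        "payload": payload,
    }
    with LOG_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example usage from within a web search interface handler:
log_event("W123", "topic-042", "query", {"text": "crowdsourcing evaluation"})
log_event("W123", "topic-042", "click", {"doc_id": "doc-17", "rank": 3})
log_event("W123", "topic-042", "relevance_judgement", {"doc_id": "doc-17", "grade": 2})
```

A flat, append-only log of this kind keeps each worker's session reconstructable (queries issued, results clicked, judgements made, with timestamps), which is the raw material needed to compare searching behaviour across crowdsourced and laboratory participants.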