Overview of the INEX 2010 book track: scaling up the evaluation using crowdsourcing

  • Authors:
  • Gabriella Kazai;Marijn Koolen;Jaap Kamps;Antoine Doucet;Monica Landoni

  • Affiliations:
  • Microsoft Research, United Kingdom;University of Amsterdam, Netherlands;University of Amsterdam, Netherlands;University of Caen, France;University of Lugano, Switzerland

  • Venue:
  • INEX'10: Proceedings of the 9th International Conference of the Initiative for the Evaluation of XML Retrieval: Comparative Evaluation of Focused Retrieval
  • Year:
  • 2010

Abstract

The goal of the INEX Book Track is to evaluate approaches for supporting users in searching, navigating and reading the full texts of digitized books. The investigation focuses on four tasks: 1) Best Books to Reference, 2) Prove It, 3) Structure Extraction, and 4) Active Reading. In this paper, we report on the setup and the results of these tasks in 2010. The main outcome of the track lies in the changes to the methodology for constructing the test collection for the evaluation of the Best Books and Prove It search tasks. In an effort to scale up the evaluation, we explored the use of crowdsourcing both to create the test topics and to gather the relevance labels for the topics over a corpus of 50k digitized books. The resulting test collection construction methodology combines editorial judgments contributed by INEX participants with crowdsourced relevance labels. We provide an analysis of the crowdsourced data and conclude that, with appropriate task design, crowdsourcing provides a suitable framework for the evaluation of book search approaches.