In the context of creating large-scale test collections, this paper discusses methods for constructing a patent test collection for the evaluation of prior art search. In particular, it addresses criteria for topic selection and for identifying recall bases. These issues arose while organizing the CLEF-IP evaluation track and were the subject of an online discussion among the track's organizers and its steering committee. Most of the literature on building test collections is concerned with minimizing the cost of obtaining relevance assessments. CLEF-IP can afford large topic sets because relevance assessments are generated by exploiting existing, manually created information. In a cost-benefit analysis, the only remaining issue seems to be the computing time participants need to run (tens or hundreds of) thousands of queries. This document describes the data sets and the decisions that led to the creation of the CLEF-IP collection.
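To make the idea of reusing existing, manually created information concrete, the sketch below shows one way a recall base could be derived automatically: treat the patents cited by each topic patent as its relevant documents and emit them as TREC-style qrels. This is a minimal illustration only; the XML schema, the `ucid`/`patcit` field names, and the single-level citation handling are assumptions for the example, not the actual CLEF-IP extraction pipeline.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def extract_recall_base(patent_xml_paths):
    """Collect, for each topic patent, the set of patents it cites.

    Assumes each file is a simplified patent XML whose root carries a
    'ucid' attribute and whose citations appear as <patcit ucid="..."/>
    elements -- a hypothetical schema used here for illustration.
    """
    qrels = defaultdict(set)
    for path in patent_xml_paths:
        root = ET.parse(path).getroot()
        topic_id = root.get("ucid")           # the topic (query) patent
        for cit in root.iter("patcit"):       # manually created citations
            cited = cit.get("ucid")
            if cited and cited != topic_id:
                qrels[topic_id].add(cited)
    return qrels


def write_qrels(qrels, out_path):
    """Write TREC-style qrels lines: <topic> 0 <doc> <relevance>."""
    with open(out_path, "w") as out:
        for topic, cited_docs in sorted(qrels.items()):
            for doc in sorted(cited_docs):
                out.write(f"{topic} 0 {doc} 1\n")
```

Because such judgments come for free with the corpus, the size of the topic set is limited mainly by the participants' query-processing time rather than by assessment cost, which is the trade-off discussed above.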