An evaluation of retrieval effectiveness for a full-text document-retrieval system
Communications of the ACM
How reliable are the results of large-scale information retrieval experiments?
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Variations in relevance judgments and the measurement of retrieval effectiveness
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Finding information on the World Wide Web: the retrieval effectiveness of search engines
Information Processing and Management: an International Journal
IR evaluation methods for retrieving highly relevant documents
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Variations in relevance judgments and the measurement of retrieval effectiveness
Information Processing and Management: an International Journal
Evaluation by highly relevant documents
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
Using graded relevance assessments in IR evaluation
Journal of the American Society for Information Science and Technology
Measuring retrieval effectiveness: a new proposal and a first experimental validation
Journal of the American Society for Information Science and Technology
The SST method: a tool for analysing web information search processes
Information Processing and Management: an International Journal
The influence of relevance levels on the effectiveness of interactive information retrieval
Journal of the American Society for Information Science and Technology
Binary and graded relevance in IR evaluations: comparison of the effects on ranking of IR systems
Information Processing and Management: an International Journal
An algorithm to cluster documents based on relevance
Information Processing and Management: an International Journal
Evaluation of resources for question answering evaluation
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Building a reusable test collection for question answering
Journal of the American Society for Information Science and Technology
An analysis of two approaches in information retrieval: From frameworks to study designs
Journal of the American Society for Information Science and Technology
Indexing strategies for Swedish full text retrieval under different user scenarios
Information Processing and Management: an International Journal
Effects of highly agreed documents in relevancy prediction
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Semantic components enhance retrieval of domain-specific documents
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Query-level loss functions for information retrieval
Information Processing and Management: an International Journal
Relevance assessment: are judges exchangeable and does it matter?
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Relevance judgments between TREC and Non-TREC assessors
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Toward automatic facet analysis and need negotiation: Lessons from mediated search
ACM Transactions on Information Systems (TOIS)
Comparing metrics across TREC and NTCIR: the robustness to system bias
Proceedings of the 17th ACM conference on Information and knowledge management
Including summaries in system evaluation
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Methods for Evaluating Interactive Information Retrieval Systems with Users
Foundations and Trends in Information Retrieval
Proceedings of the 18th ACM conference on Information and knowledge management
Metric and Relevance Mismatch in Retrieval Evaluation
AIRS '09 Proceedings of the 5th Asia Information Retrieval Symposium on Information Retrieval Technology
Assessors' search result satisfaction associated with relevance in a scientific domain
Proceedings of the third symposium on Information interaction in context
Physicists' information tasks: structure, length and retrieval performance
Proceedings of the third symposium on Information interaction in context
Does degree of work task completion influence retrieval performance?
Proceedings of the 73rd ASIS&T Annual Meeting on Navigating Streams in an Information Ecosystem - Volume 47
Simulating simple and fallible relevance feedback
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
Evaluating diversified search results using per-intent graded relevance
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Developing a test collection for the evaluation of integrated search
ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval
Dictionary-based CLIR loses highly relevant documents
ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
Interactive searching behavior with structured XML documents
INEX'04 Proceedings of the Third international conference on Initiative for the Evaluation of XML Retrieval
Experiments on average distance measure
ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval
A multimedia retrieval framework based on automatic graded relevance judgments
MMM'12 Proceedings of the 18th international conference on Advances in Multimedia Modeling
Time drives interaction: simulating sessions in diverse searching environments
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Proceedings of the 4th Information Interaction in Context Symposium
Using crowdsourcing for TREC relevance assessment
Information Processing and Management: an International Journal
How doctors search: A study of query behaviour and the impact on search results
Information Processing and Management: an International Journal
Cumulated relative position: a metric for ranking evaluation
CLEF'12 Proceedings of the Third international conference on Information Access Evaluation: multilinguality, multimodality, and visual analytics
The effect of threshold priming and need for cognition on relevance calibration and assessment
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Is relevance hard work?: evaluating the effort of making relevant assessments
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Modeling behavioral factors in interactive information retrieval
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Most test collections for experimental research in information retrieval, such as TREC and CLEF, use binary relevance assessments. This paper introduces a four-point relevance scale and reports the findings of a project in which the TREC-7 and TREC-8 document pools on 38 topics were reassessed. The goal of the reassessment was to build a subcollection of TREC for experiments on highly relevant documents, and to learn about the assessment process and the characteristics of a multigraded relevance corpus. Relevance criteria were defined to distinguish documents rich in topical information (relevant and highly relevant documents) from documents poor in topical information (marginally relevant documents). It turned out that about 50% of the documents originally assessed as relevant were regarded as only marginally relevant. The characteristics of the relevance corpus and lessons learned from the reassessment project are discussed, and the need to develop more elaborate relevance assessment schemes is emphasized.
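The relationship between the graded and binary views described above can be sketched in code. This is a minimal illustration with invented labels, thresholds, and toy data, not the paper's actual reassessment procedure: a four-point scale is collapsed to binary under a liberal threshold (marginal and up) and a strict one (relevant and up), and the share of binary-relevant documents that are only marginal falls out of the difference.

```python
# Hypothetical sketch: collapsing a four-point relevance scale
# (0 = not relevant, 1 = marginally relevant, 2 = relevant,
#  3 = highly relevant) into binary judgments. All names, thresholds,
# and data here are illustrative, not taken from the TREC reassessment.

def to_binary(grade: int, threshold: int) -> bool:
    """A document counts as relevant iff its grade meets the threshold."""
    return grade >= threshold

# Toy pool of graded assessments for one topic.
grades = [0, 1, 1, 2, 3, 1, 0, 2]

liberal = [g for g in grades if to_binary(g, 1)]  # marginal and up
strict = [g for g in grades if to_binary(g, 2)]   # relevant and up

# Share of the binary-relevant set that is only marginally relevant.
marginal_share = (len(liberal) - len(strict)) / len(liberal)
print(f"liberal: {len(liberal)}, strict: {len(strict)}, "
      f"marginal share: {marginal_share:.0%}")
# → liberal: 6, strict: 3, marginal share: 50%
```

In this toy pool the marginal share happens to be 50%, mirroring the abstract's finding that about half of the binary "relevant" set was only marginal; under a strict threshold those documents would not count as relevant at all.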