In the traditional evaluation of information retrieval systems, assessors are asked to judge the relevance of a document on a graded scale, independently of any other documents. Such judgments are absolute judgments. Learning to rank brings new challenges to this traditional evaluation methodology, especially regarding absolute relevance judgments. Recently, preference judgments have been investigated as an alternative: instead of assigning a relevance grade to a document, an assessor looks at a pair of pages and judges which one is better. In this paper, we generalize pairwise preference judgments to relative judgments. We formulate the problem of relative judgments formally and then propose a new strategy, called Select-the-Best-Ones, to solve it. Through user studies, we compare the proposed method with a pairwise preference judgment method and an absolute judgment method. The results indicate that users can distinguish about one more relevance degree with the relative methods than with the absolute method. Consequently, the relative methods generate 15-30% more document pairs for learning to rank. Compared with the pairwise method, the proposed method increases the agreement among assessors from 95% to 99%, while halving both the labeling time and the number of pairs discordant with experts' judgments.
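To make the "more document pairs" claim concrete, here is a minimal sketch (not the paper's Select-the-Best-Ones algorithm; the tier groupings and document names are illustrative assumptions) of how judgments that place documents into ordered relevance tiers induce training pairs for learning to rank: every document in a higher tier is preferred over every document in a lower tier, so a judgment method that distinguishes one extra relevance degree splits a tie and yields additional pairs.

```python
from itertools import combinations  # imported for clarity; only nested loops are used below

def preference_pairs(tiers):
    """Given an ordered partition of documents into relevance tiers
    (most relevant tier first), return every cross-tier preference pair
    (preferred, less_preferred). Documents within the same tier are
    tied and contribute no pair."""
    pairs = []
    for i, upper in enumerate(tiers):
        for lower in tiers[i + 1:]:
            for a in upper:
                for b in lower:
                    pairs.append((a, b))
    return pairs

# Hypothetical absolute judgments: six documents on a 3-grade scale.
absolute = [["d1", "d2"], ["d3", "d4"], ["d5", "d6"]]
# Hypothetical relative judgments of the same documents, resolving
# one extra relevance degree (5 tiers instead of 3).
relative = [["d1"], ["d2"], ["d3", "d4"], ["d5"], ["d6"]]

print(len(preference_pairs(absolute)))  # 12
print(len(preference_pairs(relative)))  # 14
```

In this toy setup the finer relative grouping yields 14 pairs versus 12, about 17% more, in line with the 15-30% range reported in the abstract.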