Information retrieval evaluation based on the pooling method is inherently biased against systems that did not contribute to the pool of judged documents. This bias can distort conclusions about the relative quality of the evaluated systems, and thus lead to incorrect conclusions about the performance of a particular ranking technique. We examine the magnitude of this effect and explore how it can be countered by automatically building an unbiased set of judgements from the original, biased judgements obtained through pooling. We compare the performance of this method with other approaches to the problem of incomplete judgements, such as bpref, and show that the proposed method yields higher evaluation accuracy, especially when the set of manual judgements is rich in documents but highly biased against some systems.
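The baseline measure mentioned above, bpref (Buckley and Voorhees, 2004), scores a ranking using only judged documents, penalizing each judged relevant document by the fraction of judged nonrelevant documents ranked above it; unjudged documents are simply ignored. A minimal sketch, assuming binary relevance judgements and illustrative document identifiers:

```python
def bpref(ranking, relevant, nonrelevant):
    """Compute bpref for one topic.

    ranking:     ranked list of doc ids returned by a system
    relevant:    set of doc ids judged relevant
    nonrelevant: set of doc ids judged nonrelevant
    Unjudged documents in the ranking are skipped entirely.
    """
    R, N = len(relevant), len(nonrelevant)
    if R == 0:
        return 0.0
    denom = min(R, N)          # normalizer from the bpref definition
    score = 0.0
    nonrel_above = 0           # judged nonrelevant docs seen so far
    for doc in ranking:
        if doc in nonrelevant:
            nonrel_above += 1
        elif doc in relevant:
            if denom == 0:     # no judged nonrelevant docs at all
                score += 1.0
            else:
                score += 1.0 - min(nonrel_above, denom) / denom
    return score / R

# Example: relevant docs at ranks 1 and 3, nonrelevant at ranks 2 and 4
print(bpref(["d1", "d2", "d3", "d4"], {"d1", "d3"}, {"d2", "d4"}))  # 0.75
```

Because unjudged documents contribute nothing to the score, bpref is more stable than average precision when judgements are incomplete, which is why it serves as a natural point of comparison for the approach described in the abstract.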