Overview of the second text retrieval conference (TREC-2)
TREC-2 Proceedings of the second conference on Text retrieval conference
The Cranfield tests on index language devices
Readings in information retrieval
21st Annual ACM/SIGIR International Conference on Research and Development in Information Retrieval
Efficient construction of large test collections
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
How reliable are the results of large-scale information retrieval experiments?
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Evaluating evaluation measure stability
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Ranking retrieval systems without relevance judgments
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Evaluation by highly relevant documents
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
On Collection Size and Retrieval Effectiveness
Information Retrieval
On the effectiveness of evaluating retrieval systems in the absence of relevance judgments
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Automatic ranking of retrieval systems in imperfect environments
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
A unified model for metasearch, pooling, and system evaluation
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
An empirical study of smoothing techniques for language modeling
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Retrieval evaluation with incomplete information
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
The maximum entropy method for analyzing retrieval measures
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Reliable information retrieval evaluation with incomplete and biased judgements
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
On the robustness of relevance measures with incomplete judgments
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Strategic system comparisons via targeted relevance judgments
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A comparison of pooled and sampled relevance judgments
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Inexpensive fusion methods for enhancing feature detection
Image Communication
Inferring document relevance from incomplete information
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Semiautomatic evaluation of retrieval systems using document similarities
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Retrieval sensitivity under training using different measures
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
A new rank correlation coefficient for information retrieval
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
A simple and efficient sampling method for estimating AP and NDCG
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Evaluation over thousands of queries
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Relevance assessment: are judges exchangeable and does it matter
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Comparing metrics across TREC and NTCIR:: the robustness to pool depth bias
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Estimating average precision when judgments are incomplete
Knowledge and Information Systems
Statistical power in retrieval experimentation
Proceedings of the 17th ACM conference on Information and knowledge management
Comparing metrics across TREC and NTCIR: the robustness to system bias
Proceedings of the 17th ACM conference on Information and knowledge management
Multi-cue fusion for semantic video indexing
MM '08 Proceedings of the 16th ACM international conference on Multimedia
Survey and evaluation of query intent detection methods
Proceedings of the 2009 workshop on Web Search Click Data
Nullification test collections for web spam and SEO
Proceedings of the 5th International Workshop on Adversarial Information Retrieval on the Web
ECIR '09 Proceedings of the 31th European Conference on IR Research on Advances in Information Retrieval
Measuring the Search Effectiveness of a Breadth-First Crawl
ECIR '09 Proceedings of the 31th European Conference on IR Research on Advances in Information Retrieval
Score adjustment for correction of pooling bias
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Document selection methodologies for efficient and effective learning-to-rank
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Including summaries in system evaluation
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Improving Automatic Video Retrieval with Semantic Concept Detection
SCIA '09 Proceedings of the 16th Scandinavian Conference on Image Analysis
Building a framework for the probability ranking principle by a family of expected weighted rank
ACM Transactions on Information Systems (TOIS)
A few good topics: Experiments in topic set reduction for retrieval evaluation
ACM Transactions on Information Systems (TOIS)
Episode-constrained cross-validation in video concept retrieval
IEEE Transactions on Multimedia
Measuring the reusability of test collections
Proceedings of the third ACM international conference on Web search and data mining
Exploiting external knowledge to improve video retrieval
Proceedings of the international conference on Multimedia information retrieval
Proceedings of the international conference on Multimedia information retrieval
The Pascal Visual Object Classes (VOC) Challenge
International Journal of Computer Vision
Classifier fusion for SVM-based multimedia semantic indexing
ECIR'07 Proceedings of the 29th European conference on IR research
The importance of anchor text for ad hoc search revisited
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
The effect of assessor error on IR system evaluation
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Extending average precision to graded relevance judgments
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
PRES: a score metric for evaluating recall-oriented information retrieval applications
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Assessor error in stratified evaluation
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Why finding entities in Wikipedia is difficult, sometimes
Information Retrieval
Refining video annotation by exploiting inter-shot context
Proceedings of the international conference on Multimedia
CLEF'10 Proceedings of the 2010 international conference on Multilingual and multimodal information access evaluation: cross-language evaluation forum
Evaluation effort, reliability and reusability in XML retrieval
Journal of the American Society for Information Science and Technology
Evaluation of information retrieval for E-discovery
Artificial Intelligence and Law
Evaluating multi-query sessions
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
The effects of choice in routing relevance judgments
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Prioritizing relevance judgments to improve the construction of IR test collections
Proceedings of the 20th ACM international conference on Information and knowledge management
Optimizing the cost of information retrieval testcollections
Proceedings of the 4th workshop on Workshop for Ph.D. students in information & knowledge management
Mining concept relationship in temporal context for effective video annotation
MM '11 Proceedings of the 19th ACM international conference on Multimedia
Collaborative video reindexing via matrix factorization
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
Visual vocabulary optimization with spatial context for image annotation and classification
MMM'12 Proceedings of the 18th international conference on Advances in Multimedia Modeling
On smoothing average precision
ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
Large vocabulary quantization for searching instances from videos
Proceedings of the 2nd ACM International Conference on Multimedia Retrieval
Combining inverted indices and structured search for ad-hoc object retrieval
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
An uncertainty-aware query selection model for evaluation of IR systems
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Temporal-Spatial refinements for video concept fusion
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part III
A mutual information-based framework for the analysis of information retrieval systems
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
On Using Fewer Topics in Information Retrieval Evaluations
Proceedings of the 2013 Conference on the Theory of Information Retrieval
Query-Performance Prediction Using Minimal Relevance Feedback
Proceedings of the 2013 Conference on the Theory of Information Retrieval
Document Score Distribution Models for Query Performance Inference and Prediction
ACM Transactions on Information Systems (TOIS)
Multimedia Tools and Applications
Evaluation in Music Information Retrieval
Journal of Intelligent Information Systems
Retina enhanced SURF descriptors for spatio-temporal concept detection
Multimedia Tools and Applications
Semantic concept-enriched dependence model for medical information retrieval
Journal of Biomedical Informatics
Hi-index | 0.00 |
We consider the problem of evaluating retrieval systems using incomplete judgment information. Buckley and Voorhees recently demonstrated that retrieval systems can be efficiently and effectively evaluated using incomplete judgments via the bpref measure [6]. When relevance judgments are complete, the value of bpref is an approximation to the value of average precision using complete judgments. However, when relevance judgments are incomplete, the value of bpref deviates from this value, though it continues to rank systems in a manner similar to average precision evaluated with a complete judgment set. In this work, we propose three evaluation measures that (1) are approximations to average precision even when the relevance judgments are incomplete and (2) are more robust to incomplete or imperfect relevance judgments than bpref. The proposed estimates of average precision are simple and accurate, and we demonstrate the utility of these estimates using TREC data.