The Cranfield tests on index language devices
Readings in information retrieval
Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
An empirical study of smoothing techniques for language modeling
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Retrieval evaluation with incomplete information
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Minimal test collections for retrieval evaluation
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
A statistical method for system evaluation using incomplete judgments
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Estimating average precision with incomplete and imperfect judgments
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
On the robustness of relevance measures with incomplete judgments
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Strategic system comparisons via targeted relevance judgments
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A comparison of pooled and sampled relevance judgments
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Relevance assessment: are judges exchangeable and does it matter
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Score adjustment for correction of pooling bias
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Towards methods for the collective gathering and quality control of relevance assessments
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
On statistical analysis and optimization of information retrieval effectiveness metrics
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
ACM SIGIR Forum
Online stratified sampling: evaluating classifiers at web-scale
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Aspects and analysis of patent test collections
PaIR '10 Proceedings of the 3rd international workshop on Patent information retrieval
Why finding entities in Wikipedia is difficult, sometimes
Information Retrieval
Overview of the INEX 2009 entity ranking track
INEX'09 Proceedings of the Focused retrieval and evaluation, and 8th international conference on Initiative for the evaluation of XML retrieval
Crowdsourcing for search evaluation
ACM SIGIR Forum
Crowdsourcing for search and data mining
ACM SIGIR Forum
ReFER: effective relevance feedback for entity ranking
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
Efficiently collecting relevance information from clickthroughs for web retrieval system evaluation
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Query modeling for entity search based on terms, categories, and examples
ACM Transactions on Information Systems (TOIS)
Prioritizing relevance judgments to improve the construction of IR test collections
Proceedings of the 20th ACM international conference on Information and knowledge management
Toward interactive training and evaluation
Proceedings of the 20th ACM international conference on Information and knowledge management
A fast MAP adaptation technique for gmm-supervector-based video semantic indexing systems
MM '11 Proceedings of the 19th ACM international conference on Multimedia
Crowdsourcing for information retrieval
ACM SIGIR Forum
Category-based query modeling for entity search
ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval
Cross-Language Latent Relational Search between Japanese and English Languages Using a Web Corpus
ACM Transactions on Asian Language Information Processing (TALIP)
A ranking framework for entity oriented search using Markov random fields
Proceedings of the 1st Joint International Workshop on Entity-Oriented and Semantic Search
Exploiting the category structure of Wikipedia for entity ranking
Artificial Intelligence
Crowdsourcing for information retrieval: introduction to the special issue
Information Retrieval
Large-scale visual concept detection with explicit kernel maps and power mean SVM
Proceedings of the 3rd ACM conference on International conference on multimedia retrieval
A mutual information-based framework for the analysis of information retrieval systems
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Inferring conceptual relationships to improve medical records search
Proceedings of the 10th Conference on Open Research Areas in Information Retrieval
Learning to handle negated language in medical records search
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
The TREC Medical Records Track
Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
A new statistical strategy for pooling: ELI
Information Processing Letters
Choices in batch information retrieval evaluation
Proceedings of the 18th Australasian Document Computing Symposium
Evaluation in Music Information Retrieval
Journal of Intelligent Information Systems
Retina enhanced SURF descriptors for spatio-temporal concept detection
Multimedia Tools and Applications
Semantic concept-enriched dependence model for medical information retrieval
Journal of Biomedical Informatics
We consider the problem of large-scale retrieval evaluation. Two methods based on random sampling were recently proposed to reduce the extensive effort required to judge tens of thousands of documents. The first, proposed by Aslam et al. [1], is quite accurate and efficient but overly complex, which makes it difficult for the community to use. The second, infAP, proposed by Yilmaz et al. [14], is relatively simple but less efficient than the former, since it employs uniform random sampling from the set of complete judgments. Further, neither method provides confidence intervals on the estimated values. The contribution of this paper is threefold: (1) we derive confidence intervals for infAP; (2) we extend infAP to incorporate nonrandom relevance judgments by employing stratified random sampling, thereby combining the efficiency of stratification with the simplicity of random sampling; and (3) we describe how this approach can be used to estimate nDCG from incomplete judgments. We validate the proposed methods using TREC data and demonstrate that they can incorporate nonrandom samples, such as those available in the TREC Terabyte track '06.
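To make the idea of estimating a gain-based measure from a stratified sample of judgments concrete, here is a minimal Python sketch. It is a generic inverse-inclusion-probability (Horvitz-Thompson style) estimate of DCG and is not the estimator derived in the paper. The function names, the rank-based strata, and the sampling rates are all hypothetical choices made for illustration.

```python
import math
import random


def sample_strata(ranking, strata, rng):
    """Draw a stratified sample of documents to judge.

    `strata` is a list of (start, end, rate) tuples over 0-based,
    half-open rank ranges (an assumed layout, not taken from the paper).
    Returns {doc: inclusion_probability} for the sampled documents.
    """
    sampled = {}
    for start, end, rate in strata:
        docs = ranking[start:end]
        if not docs:
            continue
        # Judge at least one and at most all documents in the stratum.
        n = min(len(docs), max(1, round(rate * len(docs))))
        for doc in rng.sample(docs, n):
            sampled[doc] = n / len(docs)
    return sampled


def true_dcg(ranking, gains):
    """DCG computed from complete judgments (gain 0 if absent)."""
    return sum(
        gains.get(doc, 0) / math.log2(rank + 1)
        for rank, doc in enumerate(ranking, start=1)
    )


def estimate_dcg(ranking, sampled, gains):
    """Estimate DCG using only judged (sampled) documents.

    Each judged document's discounted gain is divided by its
    inclusion probability, so it also stands in for the unjudged
    documents of its stratum.
    """
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in sampled:
            total += gains.get(doc, 0) / math.log2(rank + 1) / sampled[doc]
    return total


if __name__ == "__main__":
    ranking = [f"d{i}" for i in range(20)]
    gains = {"d0": 1, "d2": 1, "d9": 1, "d15": 1}
    # Sample the top ranks more densely than the tail.
    strata = [(0, 5, 1.0), (5, 20, 0.4)]
    sampled = sample_strata(ranking, strata, random.Random(0))
    print("judged:", len(sampled), "of", len(ranking))
    print("true DCG:", true_dcg(ranking, gains))
    print("estimated DCG:", estimate_dcg(ranking, sampled, gains))
```

Sampling the top ranks at a higher rate reflects the intuition that highly ranked documents contribute most to the measure. When every stratum is sampled at rate 1.0, the estimate coincides with the DCG computed from complete judgments.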