Information retrieval (IR) researchers commonly use three tests of statistical significance: Student's paired t-test, the Wilcoxon signed rank test, and the sign test. Other researchers have previously proposed using both the bootstrap and Fisher's randomization (permutation) test as non-parametric significance tests for IR, but these tests have seen little use. We took the ad-hoc retrieval runs submitted to TRECs 3 and 5-8 and, for each pair of runs, used each of these five tests to measure the statistical significance of the difference in their mean average precision. We found little practical difference between the randomization, bootstrap, and t tests. Both the Wilcoxon and sign tests have a poor ability to detect significance and have the potential to lead to false detections of significance. The Wilcoxon and sign tests are simplified variants of the randomization test, and their use should be discontinued for measuring the significance of a difference between means.
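The abstract does not spell out how the randomization test is computed, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes the two runs are compared on paired per-topic average precision scores, that the null hypothesis is simulated by randomly swapping the two systems' scores within each topic (equivalent to flipping the sign of each per-topic difference), and that the p-value is two-sided and estimated by Monte Carlo sampling. The function name `randomization_test` and the parameters `num_samples` and `seed` are hypothetical.

```python
import random


def randomization_test(scores_a, scores_b, num_samples=10000, seed=0):
    """Estimate a two-sided p-value for the difference in mean scores.

    scores_a, scores_b: per-topic scores (e.g. average precision) for two
    runs, paired by topic. All names here are illustrative assumptions.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("runs must be scored on the same topics")

    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)

    # Under the null hypothesis the system labels are exchangeable within
    # each topic, so each per-topic difference is equally likely to have
    # either sign.
    at_least_as_extreme = 0
    for _ in range(num_samples):
        permuted_mean = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(permuted_mean) >= observed:
            at_least_as_extreme += 1

    # Fraction of sampled relabelings whose mean difference is at least as
    # large in magnitude as the observed one.
    return at_least_as_extreme / num_samples
```

Calling `randomization_test(ap_run_a, ap_run_b)` with two equal-length lists of per-topic scores returns the estimated p-value; a larger `num_samples` gives a more stable estimate at the cost of running time.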