Using statistical testing in the evaluation of retrieval experiments

  • Authors: David Hull
  • Venue: SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 1993


Abstract

The standard strategies for evaluation based on precision and recall are examined and their relative advantages and disadvantages are discussed. In particular, it is suggested that relevance feedback be evaluated from the perspective of the user. A number of different statistical tests are described for determining if differences in performance between retrieval methods are significant. These tests have often been ignored in the past because most are based on an assumption of normality which is not strictly valid for the standard performance measures. However, one can test this assumption using simple diagnostic plots, and if it is a poor approximation, there are a number of non-parametric alternatives.
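Below is a minimal sketch of the kind of comparison the abstract describes: checking the normality assumption on paired per-query differences with a diagnostic plot, then applying a paired t-test alongside non-parametric alternatives (Wilcoxon signed-rank and sign tests). The per-query scores are hypothetical, generated only for illustration; the SciPy functions used (`probplot`, `ttest_rel`, `wilcoxon`, `binomtest`) are standard, but the setup is an assumption, not the paper's own experimental data.

```python
import numpy as np
from scipy import stats

# Hypothetical per-query average-precision scores for two retrieval methods.
# In practice these would come from evaluation runs over the same query set.
rng = np.random.default_rng(0)
method_a = rng.beta(2, 5, size=50)                              # 50 queries
method_b = np.clip(method_a + rng.normal(0.03, 0.05, 50), 0, 1)

differences = method_b - method_a

# Diagnostic check: a normal probability (Q-Q) plot of the paired differences.
# If the points lie close to a straight line, normality is a fair approximation.
(osm, osr), (slope, intercept, r) = stats.probplot(differences, dist="norm")
print(f"Q-Q correlation with normal quantiles: r = {r:.3f}")

# Parametric test: paired t-test (assumes roughly normal differences).
t_stat, t_p = stats.ttest_rel(method_b, method_a)
print(f"paired t-test:        t = {t_stat:.3f}, p = {t_p:.4f}")

# Non-parametric alternatives if normality looks doubtful.
w_stat, w_p = stats.wilcoxon(method_b, method_a)   # Wilcoxon signed-rank test
print(f"Wilcoxon signed-rank: W = {w_stat:.1f}, p = {w_p:.4f}")

# Sign test: binomial test on the number of positive (non-zero) differences.
n_pos = int(np.sum(differences > 0))
n_nonzero = int(np.sum(differences != 0))
sign_p = stats.binomtest(n_pos, n_nonzero, 0.5).pvalue
print(f"sign test:            p = {sign_p:.4f}")
```

The non-parametric tests trade some statistical power for robustness: they rank or count the paired differences rather than assume a particular distribution, which is why they remain valid when the diagnostic plot suggests the normality approximation is poor.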