Building simulated queries for known-item topics: an analysis using six European languages

Authors:
Leif Azzopardi; Maarten de Rijke; Krisztian Balog
Affiliations:
University of Glasgow, Glasgow, United Kingdom; University of Amsterdam, Amsterdam, Netherlands; University of Amsterdam, Amsterdam, Netherlands
Venue:
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2007

Abstract

There has been increased interest in the use of simulated queries for evaluation and estimation purposes in Information Retrieval. However, many issues regarding their usage and impact on evaluation remain unaddressed, because their quality, in terms of retrieval performance, differs from that of real queries. In this paper, we focus on methods for building simulated known-item topics and compare their quality against real known-item topics. Using existing generation models as our starting point, we explore factors that may influence the generation of a known-item topic. Informed by this detailed analysis (on six European languages), we propose a model with improved document and term selection properties, showing that simulated known-item topics can be generated that are comparable to real known-item topics. This is a significant step towards validating the potential usefulness of simulated queries, both for evaluation purposes and because building models of querying behavior provides deeper insight into the querying process, so that better retrieval mechanisms can be developed to support the user.
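
The abstract does not spell out the generation model, so the following is only a minimal sketch of the general idea of building a simulated known-item topic: pick a target document, then pick query terms, with some terms drawn from the whole collection as noise. The function name, the uniform document choice, the frequency-proportional term choice, and the `noise` and `query_len` parameters are all assumptions made for illustration, not the authors' model.

```python
import random


def simulate_known_item_query(docs, query_len=3, noise=0.2, seed=0):
    """Return (doc_id, query_terms) for one simulated known-item topic.

    docs: dict mapping doc_id -> non-empty list of tokens (hypothetical input format).
    """
    rng = random.Random(seed)

    # Step 1: choose the known item. Uniform selection is an assumption here.
    doc_id = rng.choice(sorted(docs))
    doc_terms = docs[doc_id]

    # All tokens in the collection, used as the source of "noise" terms.
    collection = [t for d in sorted(docs) for t in docs[d]]

    # Step 2: choose each query term from the document or, with
    # probability `noise`, from the collection. Picking uniformly from a
    # token list selects terms in proportion to their frequency.
    query = []
    for _ in range(query_len):
        source = collection if rng.random() < noise else doc_terms
        query.append(rng.choice(source))
    return doc_id, query


if __name__ == "__main__":
    toy_docs = {
        "d1": ["known", "item", "search", "known"],
        "d2": ["query", "simulation", "model"],
    }
    print(simulate_known_item_query(toy_docs, query_len=3, noise=0.2, seed=42))
```

The returned pair can serve as a topic (the query) and its single relevant document (the known item), which is the form of test data the abstract compares against real known-item topics.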