Pseudo test collections for training and tuning microblog rankers

Authors:
Richard Berendsen; Manos Tsagkias; Wouter Weerkamp; Maarten de Rijke
Affiliations:
University of Amsterdam, Amsterdam, Netherlands (all four authors)
Venue:
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Year:
2013

Abstract

Recent years have witnessed a persistent interest in generating pseudo test collections, both for training and for evaluation purposes. We describe a method for generating queries and relevance judgments for microblog search in an unsupervised way. Our starting point is this intuition: tweets with a hashtag are relevant to the topic covered by the hashtag and hence to a suitable query derived from the hashtag. Our baseline method selects all commonly used hashtags and takes all tweets associated with each hashtag as its relevance judgments; we then generate a query from these tweets. Next, we generate a timestamp for each query, allowing us to use temporal information in the training process. We then enrich the generation process with knowledge derived from an editorial test collection for microblog search. We use our pseudo test collections in two ways. First, we tune the parameters of a variety of well-known retrieval methods on them. Correlations with parameter sweeps on an editorial test collection are high on average, with a large variance over retrieval algorithms. Second, we use the pseudo test collections as training sets in a learning-to-rank scenario. In many cases, performance is close to that achieved by training on an editorial test collection. Our results demonstrate the utility of tuning and training microblog search algorithms on automatically generated training material.
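
The baseline generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `build_pseudo_collection`, the `min_tweets` threshold for a "commonly used" hashtag, the frequency-weighted log-ratio term score, and the choice of the latest tagged tweet as the query timestamp are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")
TOKEN = re.compile(r"[a-z0-9]+")


def build_pseudo_collection(tweets, min_tweets=2, query_len=3):
    """Build pseudo topics from (tweet_id, unix_time, text) triples.

    Returns {hashtag: {"query", "timestamp", "relevant"}}.
    All thresholds and scoring choices here are illustrative assumptions.
    """
    by_tag = defaultdict(list)   # hashtag -> [(tweet_id, time, terms)]
    doc_freq = Counter()         # term -> number of tweets containing it
    n_docs = 0
    for tweet_id, ts, text in tweets:
        n_docs += 1
        lowered = text.lower()
        tags = set(HASHTAG.findall(lowered))
        # Drop hashtags before tokenizing so the tag itself cannot
        # leak into the generated query.
        terms = set(TOKEN.findall(HASHTAG.sub(" ", lowered)))
        doc_freq.update(terms)
        for tag in tags:
            by_tag[tag].append((tweet_id, ts, terms))

    topics = {}
    for tag, posts in by_tag.items():
        if len(posts) < min_tweets:   # keep only commonly used hashtags
            continue
        tag_freq = Counter()
        for _, _, terms in posts:
            tag_freq.update(terms)

        def score(term):
            # Assumed score: how much more often the term occurs in the
            # hashtag's tweets than in the collection, weighted by count.
            in_tag = tag_freq[term] / len(posts)
            overall = doc_freq[term] / n_docs
            return tag_freq[term] * math.log(in_tag / overall)

        ranked = sorted(tag_freq, key=lambda t: (-score(t), t))
        topics[tag] = {
            "query": " ".join(ranked[:query_len]),
            # Assumed timestamp: time of the latest tagged tweet.
            "timestamp": max(ts for _, ts, _ in posts),
            # Every tweet carrying the hashtag is judged relevant.
            "relevant": sorted(tid for tid, _, _ in posts),
        }
    return topics
```

Each returned topic pairs a generated query with a query time and a set of pseudo relevance judgments, which is the shape needed both for parameter sweeps and for building learning-to-rank training sets.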