Quality models for microblog retrieval

Authors:
Jaeho Choi; W. Bruce Croft; Jin Young Kim
Affiliations:
NHN Corporation, Seongnam, South Korea; University of Massachusetts Amherst, Amherst, MA, USA; University of Massachusetts Amherst, Amherst, MA, USA
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Abstract

Microblog services typically contain very short documents (e.g., tweets) that comment on the latest news and events. Many of these documents are not informative or have very little content due to their personal and ephemeral nature. Providing effective retrieval in a microblog service requires distinguishing the high-quality, informative documents from the others. Recent work has focused on finding features that indicate the quality of microblog documents, but the impact of these quality features on retrieval is not clear. In this paper, we suggest a low-cost quality model using surrogate judgments based on user behavior (i.e., retweets) that can be collected automatically. We analyze the relationship between document informativeness and relevance judgments for microblog retrieval. Then we demonstrate that our behavior-based quality metric has a high correlation with manual judgments. We also perform experiments to study the impact of the quality model on microblog retrieval. The results based on the TREC Microblog track show that the proposed quality model, combined with a variety of retrieval models, can improve retrieval performance and is competitive with a model trained using manual relevance judgments.