Quality models for microblog retrieval

  • Authors:
  • Jaeho Choi;W. Bruce Croft;Jin Young Kim

  • Affiliations:
  • NHN Corporation, Seongnam, South Korea;University of Massachusetts Amherst, Amherst, MA, USA;University of Massachusetts Amherst, Amherst, MA, USA

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Abstract

Microblog services typically contain very short documents (e.g., tweets) that comment on the latest news and events. Many of these documents are not informative or have very little content due to their personal and ephemeral nature. Providing effective retrieval in a microblog service will require addressing the challenge of distinguishing the high-quality, informative documents from the others. Recent work has focused on finding features that indicate the quality of microblog documents, but the impact of these quality features on retrieval is not clear. In this paper, we suggest a low-cost quality model using surrogate judgments based on user behavior (i.e., retweets) that can be collected automatically. We analyze the relationship between document informativeness and relevance judgments for microblog retrieval. Then we demonstrate that our behavior-based quality metric has a high correlation with manual judgments. We also perform experiments to study the impact of the quality model on microblog retrieval. The results based on the TREC Microblog track show that the proposed quality model, combined with a variety of retrieval models, can improve retrieval performance and is competitive with a model trained using manual relevance judgments.
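
The general idea of training a quality model on retweet-based surrogate labels and combining it with a retrieval score can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two features (`has_url`, normalized length), the logistic-regression learner, the linear interpolation weight `lam`, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_quality_model(features, retweeted, epochs=500, lr=0.5):
    """Fit logistic regression by batch gradient descent.

    features:  list of equal-length feature vectors (hypothetical features)
    retweeted: list of 0/1 surrogate labels (1 = the tweet was retweeted)
    """
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, retweeted):
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = pred - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def quality_score(x, weights, bias):
    """Estimated probability that a document is high quality."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def rerank(candidates, weights, bias, lam=0.5):
    """Interpolate retrieval score and quality score (an assumed combination).

    candidates: list of (doc_id, retrieval_score in [0, 1], feature_vector)
    lam:        weight on the quality score; lam=0 ignores quality entirely
    """
    scored = [
        (doc_id, (1 - lam) * rel + lam * quality_score(x, weights, bias))
        for doc_id, rel, x in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy training data: [has_url, normalized_length] with retweet labels.
train_x = [[1, 0.9], [1, 0.7], [0, 0.2], [0, 0.1], [1, 0.8], [0, 0.3]]
train_y = [1, 1, 0, 0, 1, 0]
weights, bias = train_quality_model(train_x, train_y)

# Two candidates for one query: "a" matches slightly better, "b" looks higher quality.
candidates = [("a", 0.6, [0, 0.2]), ("b", 0.5, [1, 0.8])]
ranking_with_quality = rerank(candidates, weights, bias, lam=0.5)
ranking_without_quality = rerank(candidates, weights, bias, lam=0.0)
```

Because the labels come from retweets, no manual judging is needed to train the quality component, which is the low-cost property the abstract emphasizes. Setting `lam` to 0 recovers the plain retrieval ranking, so the effect of the quality score can be compared directly.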