Statistical Translation Language Model for Twitter Search

  • Authors:
  • Maryam Karimzadehgan; ChengXiang Zhai; Miles Efron

  • Affiliations:
  • Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801; Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801; School of Library and Information Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801

  • Venue:
  • Proceedings of the 2013 Conference on the Theory of Information Retrieval
  • Year:
  • 2013

Abstract

With the prevalence of social media applications, an increasing number of Internet users are actively publishing text online. This influx provides a wealth of textual information about those users. Ranking in social media poses challenges different from those of Web search ranking, one of which is that microblog messages are very short. As a result, the vocabulary mismatch problem is exacerbated in social media search. In this paper, we first study the standard translation model for this problem and show that the translation language model not only helps bridge the vocabulary gap but also improves the estimate of term frequency. We further propose two ways to improve the translation language model: leveraging hashtag information and adaptively setting the self-translation parameter. Experimental results on a Twitter data set show that our proposed methods are effective.
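
The abstract does not give formulas, so the following is only a minimal sketch of how a translation language model with a self-translation parameter is commonly formulated, not the authors' implementation. It assumes the document model p(w | d) = sum over u of p_t(w | u) * p_ml(u | d), where p_ml is the maximum-likelihood estimate from term counts, and it assumes the self-translation parameter is mixed in by reserving a fixed share of probability mass for a word translating to itself. The function name `translation_lm_prob`, the parameter `self_prob`, and the toy translation table are hypothetical and chosen for illustration.

```python
from collections import Counter


def translation_lm_prob(word, doc_tokens, trans, self_prob):
    """Estimate p(word | doc) under a translation language model.

    trans[u][w] is an assumed translation table giving p(w | u).
    self_prob is an assumed self-translation parameter in [0, 1]:
    the share of probability mass reserved for u translating to itself.
    """
    counts = Counter(doc_tokens)
    total = sum(counts.values())
    prob = 0.0
    for u, count in counts.items():
        p_u_given_d = count / total  # maximum-likelihood p(u | d)
        base = trans.get(u, {}).get(word, 0.0)
        if u == word:
            p_t = self_prob + (1.0 - self_prob) * base
        else:
            p_t = (1.0 - self_prob) * base
        prob += p_t * p_u_given_d
    return prob


# Toy tweet and hypothetical translation probabilities.
tweet = ["earthquake", "japan", "tsunami"]
trans = {"earthquake": {"earthquake": 0.5, "quake": 0.5}}

# "quake" never occurs in the tweet, yet it receives non-zero probability
# because "earthquake" can translate to it.
p_quake = translation_lm_prob("quake", tweet, trans, self_prob=0.4)

# With self_prob = 1.0 the model reduces to the maximum-likelihood estimate.
p_exact = translation_lm_prob("earthquake", tweet, trans, self_prob=1.0)
```

In this sketch a query term missing from a short message can still match through related terms, which is the vocabulary-gap effect the abstract describes. Fixing `self_prob` globally is the simplest choice; the abstract proposes setting it adaptively, and this example does not attempt to reproduce that.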