Short text classification in Twitter to improve information filtering

  • Authors:
  • Bharath Sriram;Dave Fuhry;Engin Demir;Hakan Ferhatosmanoglu;Murat Demirbas

  • Affiliations:
  • Ohio State University, Columbus, OH, USA;Ohio State University, Columbus, OH, USA;Ohio State University, Columbus, OH, USA;Ohio State University, Columbus, OH, USA;University at Buffalo, SUNY, NY, USA

  • Venue:
  • Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2010

Abstract

In microblogging services such as Twitter, users may become overwhelmed by the raw data. One solution to this problem is the classification of short text messages. Because short texts do not provide sufficient word occurrences, traditional classification methods such as "Bag-of-Words" have limitations. To address this problem, we propose using a small set of domain-specific features extracted from the author's profile and text. The proposed approach effectively classifies the text into a predefined set of generic classes such as News, Events, Opinions, Deals, and Private Messages.
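
The idea of replacing a sparse bag-of-words vector with a handful of hand-picked signals can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not list the features, so the feature names, word lists, and regular expressions below (`extract_features`, `OPINION_WORDS`, `SLANG_WORDS`, and so on) are hypothetical choices made for the example.

```python
import re

# Hypothetical, tiny word lists chosen only for illustration.
OPINION_WORDS = {"love", "hate", "awesome", "terrible", "best", "worst"}
SLANG_WORDS = {"lol", "omg", "brb", "gr8", "u"}

# Assumed patterns for time/event phrases, prices or discounts, and mentions.
TIME_EVENT_RE = re.compile(
    r"\b(today|tonight|tomorrow|\d{1,2}(:\d{2})?\s?(am|pm))\b", re.IGNORECASE
)
CURRENCY_PERCENT_RE = re.compile(r"[$€£]\s?\d|\d\s?%")
MENTION_RE = re.compile(r"@\w+")


def extract_features(author, text):
    """Map one short message to a small dictionary of domain-specific features."""
    stripped = text.strip()
    tokens = re.findall(r"[a-z0-9']+", stripped.lower())
    return {
        # Author identity stands in for "features from the author's profile".
        "author": author,
        "has_slang": any(t in SLANG_WORDS for t in tokens),
        "has_time_event": bool(TIME_EVENT_RE.search(stripped)),
        "has_opinion_word": any(t in OPINION_WORDS for t in tokens),
        "has_currency_or_percent": bool(CURRENCY_PERCENT_RE.search(stripped)),
        # A leading @user often marks a directed (private-style) message.
        "starts_with_mention": bool(re.match(r"@\w+", stripped)),
        "has_inner_mention": any(
            m.start() > 0 for m in MENTION_RE.finditer(stripped)
        ),
    }


if __name__ == "__main__":
    print(extract_features("dealsite", "Save 50% on shoes today!"))
    print(extract_features("carol", "@bob see u at 5pm lol"))
```

A feature dictionary like this could then be fed to any standard classifier to predict one of the generic classes (News, Events, Opinions, Deals, Private Messages); which classifier to use is left open here.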