Emotion tokens: bridging the gap among multilingual twitter sentiment analysis

  • Authors:
  • Anqi Cui;Min Zhang;Yiqun Liu;Shaoping Ma

  • Affiliations:
  • State Key Laboratory of Intelligent Technology and Systems, Tsinghua Univ., Beijing, China;State Key Laboratory of Intelligent Technology and Systems, Tsinghua Univ., Beijing, China;State Key Laboratory of Intelligent Technology and Systems, Tsinghua Univ., Beijing, China;State Key Laboratory of Intelligent Technology and Systems, Tsinghua Univ., Beijing, China

  • Venue:
  • AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
  • Year:
  • 2011

Abstract

Twitter is a microblogging service where users worldwide publish their feelings. However, sentiment analysis of Twitter messages (tweets) is regarded as a challenging problem because tweets are short and informal. In this paper, we address this problem by analyzing emotion tokens, including emotion symbols (e.g. emoticons), irregular forms of words, and combined punctuation. In our observation of five million tweets, these emotion tokens are commonly used (0.47 emotion tokens per tweet). They express a user's emotion directly, regardless of language, and hence serve as a useful signal for sentiment analysis of multilingual tweets. Firstly, emotion tokens are extracted automatically from tweets. Secondly, a graph propagation algorithm is proposed to label the tokens' polarities. Finally, a multilingual sentiment analysis algorithm is introduced. The proposed method is compared with a semantic-lexicon-based approach and several state-of-the-art Twitter sentiment analysis Web services, on both English and non-English tweets. Experimental results show the effectiveness of the proposed algorithms.
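
The three steps named in the abstract (token extraction, polarity labeling by graph propagation, and language-independent classification) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. The regular expressions, the co-occurrence graph, the averaging update rule, the seed tokens, and the function names `extract_tokens`, `propagate`, and `classify` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical token patterns (assumed, not taken from the paper):
# simple emoticons, words with a letter repeated 3+ times, and runs of ! or ?.
TOKEN = re.compile(
    r"(?P<emoticon>[:;=][-^']?[()DPp\[\]])"
    r"|(?P<elongated>\b[^\W\d_]*?(?P<ch>[^\W\d_])(?P=ch){2,}[^\W\d_]*\b)"
    r"|(?P<punct>[!?]{2,})"
)


def extract_tokens(text):
    """Return candidate emotion tokens in order of appearance."""
    return [m.group(0) for m in TOKEN.finditer(text)]


def propagate(token_lists, seeds, iterations=10):
    """Spread seed polarities over a token co-occurrence graph.

    Assumed scheme: edge weight = number of tweets in which two tokens
    co-occur; each unlabeled token takes the weighted mean of its
    neighbours' scores; seed tokens stay fixed.
    """
    weights = defaultdict(lambda: defaultdict(int))
    for tokens in token_lists:
        unique = set(tokens)
        for a in unique:
            for b in unique:
                if a != b:
                    weights[a][b] += 1

    scores = {token: 0.0 for token in weights}
    scores.update(seeds)
    for _ in range(iterations):
        updated = {}
        for token, neighbours in weights.items():
            if token in seeds:
                continue  # seeds are clamped to their given polarity
            total = sum(neighbours.values())
            updated[token] = sum(w * scores[n] for n, w in neighbours.items()) / total
        scores.update(updated)
    return scores


def classify(text, scores):
    """Label a tweet by the summed polarity of its emotion tokens."""
    total = sum(scores.get(token, 0.0) for token in extract_tokens(text))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    tweets = ["I looove this :) !!!", "so sad :( ...", "yesss :)", "nooo :("]
    seeds = {":)": 1.0, ":(": -1.0}  # hand-picked seed polarities (assumed)
    scores = propagate([extract_tokens(t) for t in tweets], seeds)
    print(scores)
    print(classify("Das ist tolll :)", scores))
```

Because the classifier in this sketch looks only at emotion tokens and never at ordinary words, the same learned scores apply to a tweet in any language, which is the property the abstract highlights.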