Automatic Domain-Specific Sentiment Lexicon Generation with Label Propagation

  • Authors:
  • Yen-Jen Tai;Hung-Yu Kao

  • Affiliations:
  • Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C.;Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C.

  • Venue:
  • Proceedings of International Conference on Information Integration and Web-based Applications & Services
  • Year:
  • 2013

Abstract

The rise of social media has led to explosive growth in opinion data, and sentiment analysis has consequently attracted considerable attention. Current sentiment analysis applications follow two main approaches: the lexicon-based approach and the machine-learning approach. Both face the challenge of obtaining large amounts of human-labeled training data or corpora. The lexicon-based approach requires a sentiment lexicon to determine opinion polarity. Many benchmark sentiment lexicons exist, but they cannot cover all domain-specific word meanings, so automatic generation of a domain-specific sentiment lexicon becomes an important task. We propose a framework that generates a sentiment lexicon automatically. First, we determine the semantic similarity between pairs of words across the entire unlabeled corpus. We then treat words as nodes and similarities as weighted edges to construct word graphs. Finally, a graph-based semi-supervised label propagation method assigns polarity to the unlabeled words through the proposed propagation process. Experiments on microblog data from Twitter show that our approach outperforms baseline approaches and general-purpose sentiment dictionaries.
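
The pipeline outlined in the abstract (word similarities, a weighted word graph, then label propagation from a few labeled words) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or the propagation rule, so the sketch assumes cosine similarity over co-occurrence counts and a simple iterative weighted-average propagation with clamped seed words. All function names, parameters (`window`, `threshold`, `iterations`), and the toy graph are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Count, for each word, the words seen within `window` tokens of it.

    Assumption: co-occurrence counts stand in for the paper's
    (unspecified here) semantic similarity features.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_graph(vectors, threshold=0.1):
    """Words become nodes; similarities at or above `threshold` become weighted edges."""
    words = sorted(vectors)
    graph = {w: {} for w in words}
    for i, u in enumerate(words):
        for v in words[i + 1:]:
            sim = cosine(vectors[u], vectors[v])
            if sim >= threshold:
                graph[u][v] = sim
                graph[v][u] = sim
    return graph


def propagate(graph, seeds, iterations=50):
    """Spread polarity scores from seed words to unlabeled words.

    Each unlabeled word takes the edge-weighted average of its neighbors'
    scores; seed words are reset (clamped) to their labels every round.
    """
    scores = {w: float(seeds.get(w, 0.0)) for w in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbors in graph.items():
            if word in seeds:
                updated[word] = float(seeds[word])
            elif neighbors:
                total = sum(neighbors.values())
                updated[word] = sum(wt * scores[n] for n, wt in neighbors.items()) / total
            else:
                updated[word] = 0.0
        scores = updated
    return scores


# Hand-built toy graph (hypothetical weights, not data from the paper).
toy_graph = {
    "good": {"sick": 0.9, "lame": 0.1},
    "bad": {"sick": 0.1, "lame": 0.9},
    "sick": {"good": 0.9, "bad": 0.1},
    "lame": {"good": 0.1, "bad": 0.9},
}
toy_scores = propagate(toy_graph, seeds={"good": 1.0, "bad": -1.0})
```

In this toy graph, "sick" is tied mostly to the positive seed and "lame" mostly to the negative seed, so the signs of their entries in `toy_scores` follow those ties. On a real corpus the graph would instead come from `build_graph(context_vectors(tokenized_posts))`, where `tokenized_posts` is a hypothetical list of token lists.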