Revised mutual information approach for german text sentiment classification

  • Authors:
  • Farag Saad;Brigitte Mathiak

  • Affiliations:
  • GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany;GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany

  • Venue:
  • Proceedings of the 22nd international conference on World Wide Web companion
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

The significant increase in content of online social media such as product reviews, blogs, forums etc., have led to an increasing attention to sentiment analysis tools and approaches that make use of mining this substantially growing content. The aim of this paper is to develop a robust classification approach of customer reviews based on a self-annotated domain-specific corpus by applying a statistical approach i.e., mutual information. First, subjective words in each test sentence are identified. Second, ambiguous adjectives such as high, low, large, many etc., are disambiguated based on their accompanying noun using a conditional mutual information approach. Third, a mutual information approach is applied to find the sentiment orientation (polarity) of the identified subjective words based on analyzing their statistical relationship with the manually annotated sentiment labels within a sizeable sentiment training data. Fourth, since negation plays a significant role in flipping the sentiment polarity of an identified sentiment word, we estimate the role of negation in affecting the classification accuracy. Finally, the identified polarity for each test sentence is evaluated against experts' annotation.