Sentiment analysis with a multilingual pipeline

  • Authors:
  • Daniella Bal; Malissa Bal; Arthur Van Bunningen; Alexander Hogenboom; Frederik Hogenboom; Flavius Frasincar

  • Affiliations:
  • Erasmus University Rotterdam, Rotterdam, The Netherlands; Erasmus University Rotterdam, Rotterdam, The Netherlands; Teezir BV, Utrecht, The Netherlands; Erasmus University Rotterdam, Rotterdam, The Netherlands; Erasmus University Rotterdam, Rotterdam, The Netherlands; Erasmus University Rotterdam, Rotterdam, The Netherlands

  • Venue:
  • WISE'11: Proceedings of the 12th International Conference on Web Information System Engineering
  • Year:
  • 2011

Abstract

Sentiment analysis refers to retrieving an author's sentiment from a text. We analyze the differences that occur in sentiment scoring across languages. We present experiments for Dutch and English based on forum, blog, news, and social media texts available on the Web, focusing on differences in language use and on the effect of a language's grammar on sentiment analysis. We propose a multilingual pipeline for evaluating how an author's sentiment is conveyed in different languages. We correctly classify positive and negative texts with an accuracy of approximately 71% for English and 79% for Dutch. The evaluation shows, however, that common expressions, emoticons, slang, irony, sarcasm, cynicism, acronyms, and different ways of negation in English prevent the underlying sentiment scores from being directly comparable.
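
To make the idea of language-specific sentiment scoring concrete, the following is a minimal sketch, not the pipeline described in the paper. It assumes a simple lexicon-based scorer in which each language has its own word list and negation words, and a negation flips the polarity of the next sentiment word. The lexicons, the names `LanguageResources`, `RESOURCES`, `score`, `classify`, and `accuracy`, and the negation rule are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageResources:
    """Per-language resources (hypothetical; not taken from the paper)."""
    lexicon: dict[str, float]   # word -> sentiment weight
    negations: frozenset[str]   # words that flip the next sentiment word


# Toy word lists invented for this example only.
RESOURCES = {
    "en": LanguageResources(
        lexicon={"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0},
        negations=frozenset({"not", "never", "no"}),
    ),
    "nl": LanguageResources(
        lexicon={"goed": 1.0, "geweldig": 2.0, "slecht": -1.0, "vreselijk": -2.0},
        negations=frozenset({"niet", "nooit", "geen"}),
    ),
}


def score(text: str, lang: str) -> float:
    """Sum lexicon weights; a negation flips the next sentiment word."""
    res = RESOURCES[lang]
    total = 0.0
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in res.negations:
            negate = True
            continue
        value = res.lexicon.get(word, 0.0)
        if value != 0.0:
            total += -value if negate else value
            negate = False  # the negation is consumed by this word
    return total


def classify(text: str, lang: str) -> str:
    """Map a score to a label; ties count as positive (an arbitrary choice)."""
    return "positive" if score(text, lang) >= 0 else "negative"


def accuracy(samples: list[tuple[str, str]], lang: str) -> float:
    """Fraction of (text, gold_label) pairs that are classified correctly."""
    correct = sum(classify(text, lang) == label for text, label in samples)
    return correct / len(samples)
```

Even in this toy setting, the raw scores depend on each language's lexicon weights and negation words, which illustrates why scores produced for different languages need not be directly comparable.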