Cross-Domain Contextualization of Sentiment Lexicons

  • Authors:
  • Stefan Gindl;Albert Weichselbraun;Arno Scharl

  • Affiliations:
  • MODUL University Vienna, Austria, Department of New Media Technology, email: {stefan.gindl,arno.scharl}@modul.ac.at;Vienna University of Economics and Business, Austria, Department of Information Systems and Operations, email: albert.weichselbraun@wu.ac.at

  • Venue:
  • Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
  • Year:
  • 2010

Abstract

The simplicity of using Web 2.0 platforms and services has resulted in an abundance of user-generated content. A significant part of this content contains user opinions with clear economic relevance, such as customer and travel reviews or the articles of well-known and respected bloggers who influence purchase decisions. Analyzing and acting upon user-generated content is becoming imperative for marketers and social scientists who aim to gather feedback from very large user communities. Sentiment detection, as part of opinion mining, supports these efforts by identifying and aggregating polar opinions, i.e., positive or negative statements about facts. To achieve accurate results, sentiment detection requires a correct interpretation of language, which remains a challenging task due to the inherent ambiguities of human languages. Particular attention has to be paid to the context of opinionated terms when trying to resolve these ambiguities. Contextualized sentiment lexicons address this need by considering the sentiment term's context in their evaluation, but they are usually limited to one domain because many contextualizations are not stable across domains. This paper introduces a method that identifies unstable contextualizations and refines the contextualized sentiment dictionaries accordingly, eliminating the need for specific training data for each individual domain. An extensive evaluation compares the accuracy of this approach with results obtained from domain-specific corpora.
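
The notion of an unstable contextualization can be pictured with a small sketch. It is illustrative only and is not the authors' method, which the abstract does not specify. It assumes a contextualization is a pair of an ambiguous sentiment term and a context term mapped to a polarity value, and that a pair counts as stable when it occurs in every domain with the same polarity sign. The function names, the domain names, and all values are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Assumed representation: (ambiguous term, context term) -> polarity in [-1, 1].
Contextualization = Dict[Tuple[str, str], float]


def stable_contextualizations(
    per_domain: Dict[str, Contextualization],
) -> Contextualization:
    """Keep pairs seen in every domain with the same polarity sign.

    The polarity of a kept pair is the mean over all domains.
    """
    domains = list(per_domain.values())
    if not domains:
        return {}
    shared = set(domains[0])
    for lexicon in domains[1:]:
        shared &= set(lexicon)
    stable: Contextualization = {}
    for key in shared:
        values = [lexicon[key] for lexicon in domains]
        if all(v > 0 for v in values) or all(v < 0 for v in values):
            stable[key] = sum(values) / len(values)
    return stable


def score(
    tokens: Iterable[str],
    base_lexicon: Dict[str, float],
    contexts: Contextualization,
) -> float:
    """Sum token polarities; matching context terms replace the base polarity."""
    tokens = list(tokens)
    present = set(tokens)
    total = 0.0
    for tok in tokens:
        matches = [
            value
            for (term, ctx), value in contexts.items()
            if term == tok and ctx in present
        ]
        if matches:
            total += sum(matches) / len(matches)
        else:
            total += base_lexicon.get(tok, 0.0)
    return total


# Hypothetical per-domain contextualizations.
per_domain = {
    "product_reviews": {
        ("high", "price"): -1.0,
        ("long", "wait"): -1.0,
        ("high", "quality"): 1.0,
    },
    "finance_news": {
        ("high", "price"): 1.0,  # sign flips, so this pair is unstable
        ("long", "wait"): -0.5,
        ("high", "quality"): 0.5,
    },
}

stable = stable_contextualizations(per_domain)
review_score = score(["the", "quality", "is", "high"], {"high": 0.0}, stable)
```

In this toy data the pair ("high", "price") changes sign between the two domains and is dropped, while ("long", "wait") and ("high", "quality") survive and can be applied to text from a domain without its own training data.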