A cross-corpus study of unsupervised subjectivity identification based on calibrated EM

  • Authors: Dong Wang; Yang Liu
  • Affiliations: The University of Texas at Dallas; The University of Texas at Dallas
  • Venue: WASSA '11 Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis
  • Year: 2011

Abstract

In this study, we investigate an unsupervised generative learning method for detecting subjectivity in text across different domains. We create an initial training set using simple lexicon information, and then apply a calibrated EM (expectation-maximization) method to learn from unannotated data. We evaluate this unsupervised approach on three domains: movie data, news resources, and meeting dialogues. We also perform a thorough analysis of factors that affect unsupervised learning, such as the size and self-labeling accuracy of the initial training set. Our experiments and analysis show inherent differences across domains and a performance gain from calibration in EM.
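
The pipeline outlined in the abstract (lexicon-based self-labeling followed by EM over unannotated text) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy lexicon, the seeding thresholds, the choice of a multinomial naive Bayes model, and in particular the `calibrate` step, which stands in for the paper's calibration by shifting the log-odds so that roughly a target fraction of documents falls on the subjective side. The abstract does not specify how the authors calibrate.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; not the lexicon used in the paper.
LEXICON = {"great", "terrible", "love", "hate", "awful", "wonderful"}

def seed_labels(docs, lexicon=LEXICON):
    """Self-label with the lexicon: 1 = subjective, 0 = objective, None = unlabeled.
    The thresholds (>= 2 hits, 0 hits) are illustrative assumptions."""
    labels = []
    for doc in docs:
        hits = sum(tok in lexicon for tok in doc)
        labels.append(1 if hits >= 2 else 0 if hits == 0 else None)
    return labels

def m_step(docs, resp, vocab, alpha=1.0):
    """Re-estimate class priors and word likelihoods (Laplace-smoothed)
    from soft labels resp[i] = P(subjective | doc i)."""
    counts = [Counter(), Counter()]
    mass = [0.0, 0.0]
    for doc, r in zip(docs, resp):
        for c, w in ((0, 1.0 - r), (1, r)):
            mass[c] += w
            for tok in doc:
                counts[c][tok] += w
    prior = [(m + alpha) / (sum(mass) + 2 * alpha) for m in mass]
    like = []
    for c in (0, 1):
        total = sum(counts[c].values()) + alpha * len(vocab)
        like.append({t: (counts[c][t] + alpha) / total for t in vocab})
    return prior, like

def log_odds(doc, prior, like):
    """Naive Bayes log-odds of subjective vs. objective for one document."""
    s = math.log(prior[1]) - math.log(prior[0])
    for tok in doc:
        s += math.log(like[1][tok]) - math.log(like[0][tok])
    return s

def calibrate(scores, target):
    """Assumed calibration: shift the log-odds so that about `target`
    of the documents score above zero. Needs at least two scores."""
    ranked = sorted(scores, reverse=True)
    k = min(max(int(round(target * len(scores))), 1), len(scores) - 1)
    shift = (ranked[k - 1] + ranked[k]) / 2.0
    return [s - shift for s in scores]

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def calibrated_em(docs, target=0.5, iters=10):
    """Train on lexicon-seeded docs, then alternate a calibrated E-step
    with an M-step over all docs. Returns P(subjective) per document."""
    vocab = {t for d in docs for t in d}
    seeds = seed_labels(docs)
    labeled = [(d, float(y)) for d, y in zip(docs, seeds) if y is not None]
    prior, like = m_step([d for d, _ in labeled], [y for _, y in labeled], vocab)
    resp = []
    for _ in range(iters):
        scores = calibrate([log_odds(d, prior, like) for d in docs], target)
        resp = [sigmoid(s) for s in scores]      # calibrated E-step
        prior, like = m_step(docs, resp, vocab)  # M-step on soft labels
    return resp

# Toy corpus (hypothetical), already tokenized.
docs = [
    ["i", "love", "this", "wonderful", "film"],
    ["what", "a", "terrible", "awful", "plot"],
    ["the", "film", "runs", "two", "hours"],
    ["the", "meeting", "starts", "at", "noon"],
    ["i", "hate", "the", "plot"],
    ["the", "report", "runs", "ten", "pages"],
]
posteriors = calibrated_em(docs, target=0.5, iters=5)
```

In this sketch, the two factors the abstract singles out map onto the seeding step: the number of documents that `seed_labels` manages to label corresponds to the size of the initial training set, and how often those labels are right corresponds to its self-labeling accuracy. The `target` parameter is a hypothetical knob for the expected share of subjective text, which could plausibly differ across domains.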