"The Godfather” vs. "Chaos”: Comparing Linguistic Analysis Based on On-line Knowledge Sources and Bags-of-N-Grams for Movie Review Valence Estimation

  • Authors:
  • Björn Schuller; Joachim Schenk; Gerhard Rigoll; Tobias Knaup

  • Venue:
  • ICDAR '09 Proceedings of the 2009 10th International Conference on Document Analysis and Recognition
  • Year:
  • 2009

Abstract

In sentiment and emotion recognition, bag-of-words modeling has recently become popular for estimating valence in text. A typical application is the evaluation of reviews of, e.g., movies, music, or games. We suggest using back-off N-grams as the basis for a vector space construction in order to combine the advantages of word-order modeling with easy integration into potential acoustic feature vectors intended for spoken document retrieval. For a fine-grained estimate, we consider data-driven regression alongside classification based on Support Vector Machines. Alternatively, the on-line knowledge sources ConceptNet, General Inquirer, and WordNet serve not only to reduce out-of-vocabulary events but also as the basis for a purely linguistic analysis. As a special benefit, this approach does not demand labeled training data. A large set of 100k movie reviews spanning 20 years, collected from Metacritic, is used throughout an extensive parameter discussion and comparative evaluation that effectively demonstrates the efficiency of the proposed methods.
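
As a rough illustration of the vector space idea mentioned in the abstract, the following minimal sketch builds count vectors over all n-grams of order 1 up to N. It assumes that "back-off" here means that lower-order n-grams are kept as features next to the highest-order ones; the abstract does not spell out the exact construction, so this reading, the function names (`ngrams_up_to`, `build_vocabulary`, `vectorize`), the whitespace tokenization, and the toy reviews are all hypothetical.

```python
from collections import Counter


def ngrams_up_to(tokens, max_n):
    """Return every n-gram of order 1..max_n as a tuple of tokens."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(tuple(tokens[i:i + n]))
    return grams


def build_vocabulary(documents, max_n, min_count=1):
    """Map each n-gram seen at least min_count times to a vector index."""
    counts = Counter()
    for doc in documents:
        # Assumption: simple lowercasing and whitespace tokenization.
        counts.update(ngrams_up_to(doc.lower().split(), max_n))
    kept = sorted(gram for gram, c in counts.items() if c >= min_count)
    return {gram: idx for idx, gram in enumerate(kept)}


def vectorize(document, vocabulary, max_n):
    """Turn one document into a count vector; unseen n-grams are skipped."""
    vec = [0] * len(vocabulary)
    for gram in ngrams_up_to(document.lower().split(), max_n):
        idx = vocabulary.get(gram)
        if idx is not None:
            vec[idx] += 1
    return vec


# Toy data, invented for illustration only.
reviews = ["a great movie", "not a great movie"]
vocab = build_vocabulary(reviews, max_n=2)
vectors = [vectorize(r, vocab, max_n=2) for r in reviews]
```

Fixed-length vectors of this kind could then be passed to a Support Vector Machine for classification or regression, or concatenated with other feature vectors. Keeping the bigram ("not", "a") alongside the unigrams is what lets such a representation retain some word-order information.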