Text representation using dependency tree subgraphs for sentiment analysis

  • Authors:
  • Alexander Pak; Patrick Paroubek

  • Affiliations:
  • Université de Paris-Sud, Laboratoire LIMSI-CNRS, Orsay Cedex, France; Université de Paris-Sud, Laboratoire LIMSI-CNRS, Orsay Cedex, France

  • Venue:
  • DASFAA'11: Proceedings of the 16th International Conference on Database Systems for Advanced Applications
  • Year:
  • 2011

Abstract

The standard approach to supervised sentiment analysis with n-gram features cannot correctly identify complex sentiment expressions, because representing a text with the bag-of-words model loses information. We propose using subgraphs of the dependency tree of a parsed sentence as features for sentiment classification. We represent a text as a feature vector built from the extracted subgraphs and use a state-of-the-art SVM classifier to identify the polarity of the text. Our experimental evaluation on the movie-review dataset shows that the proposed features outperform the standard bag-of-words and n-gram models. In this paper we work with English; however, most of our techniques can be easily adapted to other languages.
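
To make the idea of subgraph features concrete, here is a minimal Python sketch. The abstract does not specify how subgraphs are selected, so everything below is an assumption for illustration: the hand-written parse in `PARSE`, the dependency labels, the helper names `edges`, `edge_name`, and `subgraph_features`, and the choice to count only one-edge and connected two-edge subgraphs. It is not the authors' extraction procedure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical hand-written parse of "I do not like this movie".
# Each token: (word, index of its head token, dependency label).
# A head index of -1 marks the root. A real system would get this from a parser.
PARSE = [
    ("I", 3, "nsubj"),
    ("do", 3, "aux"),
    ("not", 3, "neg"),
    ("like", -1, "root"),
    ("this", 5, "det"),
    ("movie", 3, "dobj"),
]


def edges(parse):
    """Return (head_index, dependent_index, label) for every non-root token."""
    return [(head, i, label) for i, (_, head, label) in enumerate(parse) if head >= 0]


def edge_name(parse, edge):
    """Render one dependency edge as a string such as 'like-neg->not'."""
    head, dep, label = edge
    return f"{parse[head][0]}-{label}->{parse[dep][0]}"


def subgraph_features(parse):
    """Count one-edge and connected two-edge subgraphs of the dependency tree."""
    es = edges(parse)
    feats = Counter(edge_name(parse, e) for e in es)
    for a, b in combinations(es, 2):
        # Two edges form a connected subgraph when they share a node.
        if {a[0], a[1]} & {b[0], b[1]}:
            pair = sorted((edge_name(parse, a), edge_name(parse, b)))
            feats[" & ".join(pair)] += 1
    return feats


if __name__ == "__main__":
    for name, count in sorted(subgraph_features(PARSE).items()):
        print(count, name)
```

Unlike a bag of words, a feature such as `like-dobj->movie & like-neg->not` keeps the negation attached to the verb it modifies. A dictionary of counts like this could then be vectorized and passed to an SVM, for example with scikit-learn's `DictVectorizer` and `LinearSVC`, although the abstract does not say which SVM implementation was used.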