An empirical study for determining relevant features for sentiment summarization of online conversational documents

  • Authors:
  • Gino Mangnoesing; Arthur van Bunningen; Alexander Hogenboom; Frederik Hogenboom; Flavius Frasincar

  • Affiliations:
  • Erasmus University Rotterdam, Rotterdam, The Netherlands; Teezir BV, Utrecht, The Netherlands; Erasmus University Rotterdam, Rotterdam, The Netherlands; Erasmus University Rotterdam, Rotterdam, The Netherlands; Erasmus University Rotterdam, Rotterdam, The Netherlands

  • Venue:
  • WISE'12 Proceedings of the 13th International Conference on Web Information Systems Engineering
  • Year:
  • 2012

Abstract

The phenomenon of big data makes managing, processing, and extracting valuable information from the Web an increasingly challenging task. In particular, the abundance of user-generated content with opinions about products or brands calls for appropriate tools to capture consumer sentiment. Such tools can aggregate content by means of sentiment summarization techniques, which extract text segments that reflect the overall sentiment of a text in a compressed form. We explore which features distinguish relevant from irrelevant text segments in terms of the extent to which they reflect the overall sentiment of conversational documents. In our empirical study on a collection of Dutch conversational documents, we find that text segments with opinions, segments with arguments supporting these opinions, segments discussing aspects of the subject of a text, and relatively long sentences are key indicators of text segments that summarize the sentiment conveyed by a text as a whole.